File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/06/w06-2717_concl.xml
Size: 2,292 bytes
Last Modified: 2025-10-06 13:55:46
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2717"> <Title>XML-based Phrase Alignment in Parallel Treebanks</Title> <Section position="7" start_page="94" end_page="95" type="concl"> <SectionTitle> 6 Conclusion </SectionTitle> <Paragraph position="0"> We have shown a straightforward way to tie in XML-based phrase alignment information with syntax trees represented in TIGER-XML. The alignment information is stored independently of the treebank files. This independence allows for modularization and separation of the annotation, but it entails that the synchronization of the treebanks with the alignment needs to be guarded separately. If any of the treebanks is modified, the alignment must be updated accordingly.</Paragraph> <Paragraph position="1"> Footnote 5: The final result of an m:n tree alignment can be visualized with an SVG-based display, which we have described in (Samuelsson and Volk, 2005). SVG (Scalable Vector Graphics) describes vector graphics in XML.</Paragraph> <Paragraph position="2"> We have argued for the use of a graphical TreeAligner to display and interactively modify the alignment between parallel syntax trees. The TreeAligner allows for m:n sentence alignment, word alignment and node alignment. It also supports the distinction between exact and approximate alignments.</Paragraph> <Paragraph position="3"> As a next step, we plan to integrate a component for automatic phrase alignment into the TreeAligner. The user can then select a tree pair and will get automatic phrase alignment predictions. We have already experimented with the projection of automatically computed word alignments to predict phrase alignment. Of course, the automatic phrase alignment has to be checked manually if we want to ensure high-quality alignment data.</Paragraph> <Paragraph position="4"> Another avenue of further research is the inclusion of yet more levels of annotation. 
For example, we are currently experimenting with the annotation of semantic frames on top of the treebanks. We use the SALSA tool developed at Saarbrücken University (Erk and Pado, 2004), which also assumes TIGER-XML input. TIGER-XML has thus become the lingua franca of treebank annotation, which allows for the addition of arbitrary layers.</Paragraph> </Section> </Paper>