<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2112"> <Title>R{j}ecnik.com: English--Serbo-Croatian Electronic Dictionary</Title> <Section position="6" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> We have presented the features of an electronic English-SC dictionary. The dictionary is designed to be multi-functional, providing interfaces for producing both a printed dictionary copy and an on-line searchable lexicon. We propose a dictionary structure inspired by WordNet, which is flexible and easy to maintain. We also report usage statistics for the on-line dictionary site over the last five years.</Paragraph> <Paragraph position="1"> Future work. The plan for future work includes incorporating a lemmatizer that would map inflected word forms to their canonical forms. This is relevant for English, but it is a more important issue for SC, which is a highly inflected language. We do not know of any lemmatizer or stemmer currently available for SC. Software interfaces for producing a wordnet form and a TEI-encoded form of the dictionary will also be developed. The issue of long queries also needs to be addressed. Currently, if a user submits a long query, which is usually a sentence or a paragraph, the dictionary reports &quot;zero entries found.&quot; A fall-back strategy should be provided: tokenize the input and return the results of querying each lexeme separately.</Paragraph> </Section> </Paper>