Vertical ACL ARC 2.0 | Behrang Q. Zadeh | Senior Data Scientist at Henkel's Center of Data Analytics

The ACL ARC 2.0 can be browsed online using the NoSketch Engine from this link. This version of the corpus is derived by processing (part-of-speech tagging and restructuring) the ParsCit output included in the official release of the ACL ARC 2.0. The vertical corpus and its accompanying registry file are available from this link (tar.gz format).

OpenNLP is used for segmenting sentences. Stanford CoreNLP is used for tokenisation and part-of-speech tagging.
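
For readers unfamiliar with the vertical format, the following minimal Python sketch shows how segmented, tokenised and tagged sentences might be written out as a vertical file. It is an illustration only and not the pipeline used to build this corpus. The column order (word, part-of-speech tag, lemma), the <s> sentence structure and the helper name to_vertical are assumptions; the registry file shipped with the corpus is the authority on the actual attributes.

```python
from typing import Iterable, List, Tuple

# Assumed column order: (word, part-of-speech tag, lemma).
Token = Tuple[str, str, str]


def to_vertical(sentences: Iterable[List[Token]], section_type: str = "method") -> str:
    """Render tagged sentences as vertical text: one token per line,
    tab-separated attributes, XML-like tags for structures."""
    lines = [f'<section type="{section_type}">']
    for sentence in sentences:
        lines.append("<s>")
        for word, pos, lemma in sentence:
            lines.append("\t".join((word, pos, lemma)))
        lines.append("</s>")
    lines.append("</section>")
    return "\n".join(lines) + "\n"


# Hypothetical input, standing in for sentence splitter and tagger output.
sample = [[("Embeddings", "NNS", "embedding"), ("help", "VBP", "help"), (".", ".", ".")]]
vertical_text = to_vertical(sample)
print(vertical_text, end="")
```

Each token occupies one line, and structures such as sections and sentences are marked by tags on lines of their own, which is what allows queries to be restricted to a given structure.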

Example queries:

Concordances of the lemma "embedding" within method sections can be retrieved using the CQL query [lemma="embedding"] within <section type="method"/>.
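
To make the meaning of this query concrete, here is a rough Python sketch that emulates it directly over lines of a vertical file, without the NoSketch Engine. It assumes tab-separated token lines with the lemma in the third column, non-nested <section type="..."> structures, and exact string matching on the lemma (CQL itself matches attribute values as regular expressions). The function name find_lemma_in_sections and the sample data are hypothetical.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Captures the value of the type attribute of an opening <section ...> tag.
SECTION_OPEN = re.compile(r'<section\b[^>]*\btype="([^"]*)"')


def find_lemma_in_sections(
    vertical_lines: Iterable[str],
    lemma: str,
    section_type: str = "method",
    lemma_col: int = 2,  # assumption: columns are word, tag, lemma
) -> List[Tuple[int, str]]:
    """Return (line number, word form) for tokens whose lemma equals
    `lemma` and that occur inside a section of the given type."""
    hits: List[Tuple[int, str]] = []
    current_type: Optional[str] = None
    for line_no, raw in enumerate(vertical_lines, start=1):
        line = raw.rstrip("\n")
        if "\t" in line:
            # Token line: check the lemma column if we are in the right section.
            cols = line.split("\t")
            if (
                current_type == section_type
                and len(cols) > lemma_col
                and cols[lemma_col] == lemma
            ):
                hits.append((line_no, cols[0]))
        elif line.startswith("</section"):
            current_type = None
        elif line.startswith("<section"):
            match = SECTION_OPEN.match(line)
            current_type = match.group(1) if match else None
    return hits


# Hypothetical sample in the assumed format.
sample = [
    '<section type="introduction">',
    "<s>",
    "Embeddings\tNNS\tembedding",
    "</s>",
    "</section>",
    '<section type="method">',
    "<s>",
    "We\tPRP\twe",
    "train\tVBP\ttrain",
    "embeddings\tNNS\tembedding",
    "</s>",
    "</section>",
]
hits = find_lemma_in_sections(sample, "embedding")
print(hits)
```

Only the occurrence inside the method section is reported; the one in the introduction is skipped, which mirrors the effect of the within clause in the query.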

This page last edited on 11 October 2025.