Commit
resolved conflict in software list
hennyu committed Feb 20, 2024
2 parents 50b6b90 + aba3da0 commit bb8b822
Showing 6 changed files with 105 additions and 51 deletions.
95 changes: 54 additions & 41 deletions data/JTEI/14_2021-23/jtei-cc-pn-erjavec-195-source.xml
@@ -207,10 +207,11 @@
<div xml:id="schema">
<head>The Parla-CLARIN Schema</head>
<p>Parla-CLARIN is written as a TEI ODD document, consisting of the prose guidelines and
- the schema specification, on the basis of which it is possible, using the standard TEI
- XSLT stylesheets, to derive an XML schema expressed either as a RelaxNG schema, a DTD,
- or a W3C schema, which is then used for formal validations of a Parla-CLARIN
- parliamentary corpus.</p>
+ the schema specification, on the basis of which it is possible, using the <ptr
+ type="software" xml:id="R5" target="#teistylesheets"/><rs type="soft.name" ref="#R5"
+ >standard TEI XSLT stylesheets</rs>, to derive an XML schema expressed either as a
+ RelaxNG schema, a DTD, or a W3C schema, which is then used for formal validations of a
+ Parla-CLARIN parliamentary corpus.</p>
<p>While the proposal tries to cater for many encoding needs, it is possible that new
users will have to use TEI elements or attributes that are not discussed in the prose
guidelines. Since the recommendations are still under development, the formal schema
@@ -324,20 +325,22 @@
<div xml:id="presentation">
<head>Presentation of Parla-CLARIN</head>
<p>Like the TEI Guidelines, the Parla-CLARIN recommendations are available on <ref
- target="https://github.com/clarin-eric/parla-clarin/"><ptr type="software"
- xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub"
- >GitHub</rs></ref>, as a project<note>Tomaž Erjavec and Andrej Pančur, Parla-CLARIN
- project <ptr type="software" xml:id="GitHub" target="#GitHub"/><rs type="soft.name"
- ref="#GitHub">GitHub</rs> site, last updated March 17, 2021, <ptr
- target="https://github.com/clarin-eric/parla-clarin/"/>.</note> of the CLARIN ERIC
- collection. The project contains a folder for the schema (i.e., the Parla-CLARIN ODD
- document and XML schemas derived from it), a folder for the programs that convert the
- ODD into the XML schemas and to the HTML of the prose and schema definitions, and a
- folder for examples, which contains an artificial but fully worked out example of a
- Parla-CLARIN document and subfolders with various example resources, where each should
- contain: <list rend="ordered">
+ target="https://github.com/clarin-eric/parla-clarin/"><ptr type="software" xml:id="R1"
+ target="#GitHub"/><rs type="soft.name" ref="#R1">GitHub</rs></ref>, as a
+ project<note>Tomaž Erjavec and Andrej Pančur, Parla-CLARIN project <ptr
+ type="software" xml:id="R2" target="#GitHub"/><rs type="soft.name" ref="#R2"
+ >GitHub</rs> site, last updated March 17, 2021, <ptr type="software" xml:id="R9"
+ target="#parlaclarinscripts"/><rs type="soft.url" ref="#R9"><ptr
+ target="https://github.com/clarin-eric/parla-clarin/"/></rs>.</note> of the CLARIN
+ ERIC collection. The project contains a folder for the schema (i.e., the Parla-CLARIN
+ ODD document and XML schemas derived from it), a folder for the <rs type="soft.name"
+ ref="#R9">programs that convert the ODD into the XML schemas and to the HTML of the
+ prose and schema definitions</rs>, and a folder for examples, which contains an
+ artificial but fully worked out example of a Parla-CLARIN document and subfolders with
+ various example resources, where each should contain: <list rend="ordered">
<item>a sample of a corpus in its source encoding;</item>
- <item>XSLT script to convert it into Parla-CLARIN; and</item>
+ <item><rs type="soft.name" ref="#R9">XSLT script to convert it into Parla-CLARIN</rs>;
+ and</item>
<item>the output of the conversion.</item>
</list>
</p>
@@ -495,12 +498,15 @@
<p>Nevertheless, AKN is an important schema for modeling parliamentary proceedings,
especially as the primary encoding standard used by various legislative bodies, so some
of AKN’s solutions were used in developing the Parla-CLARIN proposal, in particular the
- typology of divisions of a document. Also developed was a partial, but non-trivial,
- conversion from AKN to Parla-CLARIN, which covers several AKN example documents. As
- mentioned in <ptr type="crossref" target="#presentation"/>, the example documents and
- conversion script can be found in the <ident>Examples</ident> folder of the Parla-CLARIN
- Git repository. The <ident>akn2tei.xsl</ident> script attempts to preserve the IDs of
- the source AKN document, converts the AKN addressee, role, and questions and answers to
+ typology of divisions of a document. Also developed was a partial, but non-trivial, <ptr
+ type="software" xml:id="R10" target="#parlaclarinscripts"/><rs type="soft.name"
+ ref="#R10">conversion from AKN to Parla-CLARIN</rs>, which covers several AKN example
+ documents. As mentioned in <ptr type="crossref" target="#presentation"/>, the example
+ documents and conversion script can be found in the <ident>Examples</ident> folder of
+ the Parla-CLARIN Git repository. The <ptr type="software" xml:id="R11"
+ target="#parlaclarinscripts"/><rs type="soft.name" ref="#R11"
+ ><ident>akn2tei.xsl</ident></rs> script attempts to preserve the IDs of the source
+ AKN document, converts the AKN addressee, role, and questions and answers to
Parla-CLARIN, and maps FRBR data (which distinguishes a <soCalled>work</soCalled> from
its <soCalled>expression</soCalled> and its expression from its
<soCalled>manifestation</soCalled>) to the appropriate TEI elements and attributes.
@@ -572,9 +578,10 @@
parliamentary proceedings meant for scholarly investigations. This scheme is currently a
straightforward customization of the TEI Guidelines, with the majority of the effort
having gone into the writing of the prose guidelines of the Parla-CLARIN recommendations
- and into developing the conversion from Akoma Ntoso to Parla-CLARIN. We have not included
- examples of the encoding, as these are readily available on the <ptr type="software"
- xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub">GitHub</rs>
+ and into developing the <ptr type="software" xml:id="R12" target="#parlaclarinscripts"
+ /><rs type="soft.name" ref="#R12">conversion from Akoma Ntoso to Parla-CLARIN</rs>. We
+ have not included examples of the encoding, as these are readily available on the <ptr
+ type="software" xml:id="R3" target="#GitHub"/><rs type="soft.name" ref="#R3">GitHub</rs>
documentation page of the project, and large Parla-CLARIN encoded corpora are openly
available.</p>
<p>Apart from the siParl 2.0 corpus mentioned above (<ptr type="crossref"
@@ -601,15 +608,21 @@
<p>As we wanted to have corpora that are not only interchangeable but interoperable as well,
we created a bespoke ParlaMint XML schema directly in RelaxNG – the schema is compatible
with Parla-CLARIN as it validates a subset of documents that would be validated against
- Parla-CLARIN. We produced common scripts that can convert any of the four corpora to plain
- text, to CoNLL-U format as used by the Universal Dependencies project, and to vertical
- format as used by the <ref target="http://cwb.sourceforge.net/">CWB</ref><note>The IMS
- Open Corpus Workbench (CWB), last modified March 30, 2021, <ptr
- target="http://cwb.sourceforge.net/"/>.</note> and <ref
- target="http://www.sketchengine.eu/">Sketch Engine</ref><note>Accessed January 13, 2022,
- <ptr target="http://www.sketchengine.eu/"/>.</note> (<ref type="bibl"
- target="#kilgarriff14">Kilgarriff et al. 2014</ref>) concordancers, as well as to
- extract complete speech metadata into TSV files.</p>
+ Parla-CLARIN. We produced <ptr type="software" xml:id="R13" target="#parlaclarinscripts"
+ /><rs type="soft.url" ref="#R13">common scripts that can convert any of the four corpora
+ to plain text, to CoNLL-U format as used by the Universal Dependencies project, and to
+ vertical format as used by the <ptr type="software" xml:id="R14" target="#cwb"/><rs
+ type="soft.url" ref="#R14"><ref target="http://cwb.sourceforge.net/"
+ >CWB</ref></rs></rs><note>The <rs type="soft.name" ref="#R14">IMS Open Corpus Workbench
+ (CWB)</rs>, last modified March 30, 2021, <rs type="soft.url" ref="#R14"><ptr
+ target="http://cwb.sourceforge.net/"/></rs>.</note> and <ptr type="software"
+ xml:id="R15" target="#sketchengine"/><rs type="soft.url" ref="#R15"><ref
+ target="http://www.sketchengine.eu/"><rs type="soft.name" ref="#R15">Sketch
+ Engine</rs></ref></rs><note>Accessed January 13, 2022, <rs type="soft.url"
+ ref="#R15"><ptr target="http://www.sketchengine.eu/"/></rs>.</note> (<rs
+ type="soft.bib.ref" ref="#R15"><ref type="bibl" target="#kilgarriff14">Kilgarriff et al.
+ 2014</ref></rs>) concordancers, as well as to extract complete speech metadata into
+ TSV files.</p>
<p>In order for Parla-CLARIN to achieve its goal of becoming a widely recognized encoding
format for corpora of parliamentary proceedings, significant work remains to be done. On
the basis of the lessons learned in creating ParlaMint, we plan to revise the prose
@@ -619,10 +632,10 @@
specification from the default ones in the TEI Guidelines to ones taken or adapted from
the collected parliamentary corpora.</p>
<p>Second, as we have already done for ParlaMint, we plan to add to the <ptr type="software"
- xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub">GitHub</rs>
- Parla-CLARIN project more down-conversion scripts with which we would increase the
- usability of the Parla-CLARIN corpora. As mentioned, work also needs to be done to develop
- a conversion to RDF.</p>
+ xml:id="R4" target="#GitHub"/><rs type="soft.name" ref="#R4">GitHub</rs> Parla-CLARIN
+ project more down-conversion scripts with which we would increase the usability of the
+ Parla-CLARIN corpora. As mentioned, work also needs to be done to develop a conversion to
+ RDF.</p>
<p>Last, but not least, one of the great benefits of Git is the ability to support
collaborative work, be it through posting issues, or through using pull requests to
incorporate changes. While the community has not so far made use of these options, we hope
@@ -790,8 +803,8 @@
<bibl xml:id="kilgarriff14"><author>Kilgarriff, Adam</author>, <author>Vít Baisa</author>,
<author>Jan Bušta</author>, <author>Miloš Jakubíček</author>, <author>Vojtěch
Kovář</author>, <author>Jan Michelfeit</author>, <author>Pavel Rychlý</author>, and
- <author>Vít Suchomel</author>. <date>2014</date>. <title level="a">The Sketch Engine:
- Ten Years On.</title>
+ <author>Vít Suchomel</author>. <rs type="soft.bib.ref" ref="ewfew"><date>2014</date>.
+ <title level="a">The Sketch Engine: Ten Years On.</title></rs>
<title level="j">Lexicography: Journal of ASIALEX</title>
<biblScope unit="volume">1</biblScope> (<biblScope unit="issue">1</biblScope>):
<biblScope unit="page">7–36</biblScope>. doi:<idno type="DOI"
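Reviewer note: the hunks above replace the repeated `xml:id="GitHub"` with distinct identifiers (`R1`, `R2`, ...), and the last hunk introduces `ref="ewfew"`, which has no `#` prefix. The following is a minimal sketch, not part of this commit, of how duplicate `xml:id` values and unresolved `rs/@ref` pointers could be listed with Python's standard `xml.etree.ElementTree`. The function names and the sample fragment are hypothetical, and the rule that every `rs/@ref` must be a `#`-prefixed pointer to an `xml:id` in the same file is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def duplicate_xml_ids(xml_text):
    """Return the xml:id values that occur more than once, sorted."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


def dangling_rs_refs(xml_text):
    """Return rs/@ref values that do not resolve to an xml:id in the document.

    Assumption: a valid @ref has the form '#' + an existing xml:id.
    """
    root = ET.fromstring(xml_text)
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}
    unresolved = []
    for el in root.iter():
        local_name = el.tag.rsplit("}", 1)[-1]  # drop the namespace part
        ref = el.get("ref")
        if local_name == "rs" and ref is not None:
            if not ref.startswith("#") or ref[1:] not in ids:
                unresolved.append(ref)
    return unresolved


# Hypothetical fragment modelled on the markup in the diff above.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <ptr type="software" xml:id="GitHub" target="#GitHub"/>
  <rs type="soft.name" ref="#GitHub">GitHub</rs>
  <ptr type="software" xml:id="GitHub" target="#GitHub"/>
  <rs type="soft.bib.ref" ref="ewfew">2014</rs>
</p>"""

if __name__ == "__main__":
    print(duplicate_xml_ids(SAMPLE))   # ['GitHub']
    print(dangling_rs_refs(SAMPLE))    # ['ewfew']
```

ElementTree does not validate, so a file with duplicate identifiers still parses and can be inspected this way; a validating parser would reject it before any report could be produced.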
8 changes: 7 additions & 1 deletion schema/tei_jtei_annotated.odd
@@ -2277,7 +2277,9 @@
<valItem mode="add" ident="#webpack"/>
<valItem mode="add" ident="#elem"/>
<valItem mode="add" ident="#literaturetranslation"/>
+ <valItem mode="add" ident="#teitok"/>
<valItem mode="add" ident="#github"/>
+ <valItem mode="add" ident="#githubpages"/>
<valItem mode="add" ident="#leaflet"/>
<valItem mode="add" ident="#ugarit"/>
<valItem mode="add" ident="#smartcompose"/>
@@ -2384,6 +2386,9 @@
<valItem mode="add" ident="#azurecloud"/>
<valItem mode="add" ident="#gate"/>
<valItem mode="add" ident="#r"/>
+ <valItem mode="add" ident="#textualcommunities"/>
+ <valItem mode="add" ident="#visualstudiocode"/>
+ <valItem mode="add" ident="#scholarlyxml"/>
<valItem mode="add" ident="#igraph"/>
<valItem mode="add" ident="#textal"/>
<valItem mode="add" ident="#planthumanitiesworkbench"/>
@@ -2422,7 +2427,6 @@
<valItem mode="add" ident="#eppt"/>
<valItem mode="add" ident="#elwoodviewer"/>
<valItem mode="add" ident="#evt"/>
- <valItem mode="add" ident="#boilerplate"/>
<valItem mode="add" ident="#tei2html"/>
<valItem mode="add" ident="#basex"/>
<valItem mode="add" ident="#tipuesearch"/>
@@ -2447,6 +2451,7 @@
<valItem mode="add" ident="#ediarum"/>
<valItem mode="add" ident="#transkribus"/>
<valItem mode="add" ident="#xproc"/>
+ <valItem mode="add" ident="#ceteicean"/>
<valItem mode="add" ident="#imagemarkuptool"/>
<valItem mode="add" ident="#tustep"/>
<valItem mode="add" ident="#netbeans"/>
@@ -2481,6 +2486,7 @@
<valItem mode="add" ident="#teipelican"/>
<valItem mode="add" ident="#odd2odd"/>
<valItem mode="add" ident="#zenodo"/>
+ <valItem mode="add" ident="#parlaclarinscripts"/>
</valList>
<dataRef name="anyURI" restriction="http.+|#.+|@.+|hdl.+|mailto.+"/>
</alternate>
18 changes: 15 additions & 3 deletions schema/tei_jtei_annotated.rng
@@ -5,7 +5,7 @@
xmlns="http://relaxng.org/ns/structure/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
ns="http://www.tei-c.org/ns/1.0"><!--
- Schema generated from ODD source 2024-02-20T13:55:46Z. 2014.
+ Schema generated from ODD source 2024-02-20T15:55:29Z. 2014.
TEI Edition: P5 Version 4.7.0. Last updated on 16th November 2023, revision e5dd73ed0
TEI Edition Location: https://www.tei-c.org/Vault/P5/4.7.0/
@@ -2745,8 +2745,12 @@ attributes @target and @cRef may be supplied on <name/>.</report>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#literaturetranslation</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#teitok</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#github</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#githubpages</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#leaflet</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#ugarit</value>
@@ -2959,6 +2963,12 @@ attributes @target and @cRef may be supplied on <name/>.</report>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#r</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#textualcommunities</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#visualstudiocode</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#scholarlyxml</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#igraph</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#textal</value>
@@ -3035,8 +3045,6 @@ attributes @target and @cRef may be supplied on <name/>.</report>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#evt</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
- <value>#boilerplate</value>
- <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#tei2html</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#basex</value>
@@ -3085,6 +3093,8 @@ attributes @target and @cRef may be supplied on <name/>.</report>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#xproc</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#ceteicean</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#imagemarkuptool</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#tustep</value>
@@ -3153,6 +3163,8 @@ attributes @target and @cRef may be supplied on <name/>.</report>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
<value>#zenodo</value>
<a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
+ <value>#parlaclarinscripts</value>
+ <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
</choice>
<data type="anyURI">
<param name="pattern">http.+|#.+|@.+|hdl.+|mailto.+</param>
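Reviewer note: the `.rng` header states that the schema is generated from the ODD source, and the two diffs above add and remove the same identifiers in both files. The following is a minimal sketch, not part of this commit, of a consistency check between the two value lists using Python's standard `xml.etree.ElementTree`. The helper names and the sample fragments are hypothetical. It assumes `valItem` is in the TEI namespace and `value` in the RELAX NG namespace, and it compares every `value` in the fragment it is given, so on a full schema the relevant `<choice>` would first have to be isolated.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
RNG_NS = "{http://relaxng.org/ns/structure/1.0}"


def odd_idents(odd_text):
    """Collect the @ident of every valItem in an ODD fragment."""
    root = ET.fromstring(odd_text)
    return {item.get("ident") for item in root.iter(TEI_NS + "valItem")}


def rng_values(rng_text):
    """Collect the text of every value element in a RELAX NG fragment."""
    root = ET.fromstring(rng_text)
    return {value.text for value in root.iter(RNG_NS + "value")}


def out_of_sync(odd_text, rng_text):
    """Return (only_in_odd, only_in_rng) as sorted lists."""
    odd, rng = odd_idents(odd_text), rng_values(rng_text)
    return sorted(odd - rng), sorted(rng - odd)


# Hypothetical fragments: the ODD already lists the new identifier,
# while the RELAX NG file has not been regenerated yet.
ODD_SAMPLE = """<valList xmlns="http://www.tei-c.org/ns/1.0">
  <valItem mode="add" ident="#zenodo"/>
  <valItem mode="add" ident="#parlaclarinscripts"/>
</valList>"""

RNG_SAMPLE = """<choice xmlns="http://relaxng.org/ns/structure/1.0">
  <value>#zenodo</value>
  <value>#boilerplate</value>
</choice>"""

if __name__ == "__main__":
    only_in_odd, only_in_rng = out_of_sync(ODD_SAMPLE, RNG_SAMPLE)
    print(only_in_odd)   # ['#parlaclarinscripts']
    print(only_in_rng)   # ['#boilerplate']
```

Two empty lists would indicate that the generated schema matches the ODD for the fragment checked; anything else suggests the `.rng` needs to be regenerated.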