Commit 772e3ed8 authored by Fethi Bougares

add corpus NER

parent 8137148d
\relax
\providecommand\hyper@newdestlabel[2]{}
\catcode `:\active
\catcode `;\active
\catcode `!\active
\catcode `?\active
\providecommand\HyperFirstAtBeginDocument{\AtBeginDocument}
\HyperFirstAtBeginDocument{\ifx\hyper@anchor\@undefined
\global\let\oldcontentsline\contentsline
\gdef\contentsline#1#2#3#4{\oldcontentsline{#1}{#2}{#3}}
\global\let\oldnewlabel\newlabel
\gdef\newlabel#1#2{\newlabelxx{#1}#2}
\gdef\newlabelxx#1#2#3#4#5#6{\oldnewlabel{#1}{{#2}{#3}}}
\AtEndDocument{\ifx\hyper@anchor\@undefined
\let\contentsline\oldcontentsline
\let\newlabel\oldnewlabel
\fi}
\fi}
\global\let\hyper@last\relax
\gdef\HyperFirstAtBeginDocument#1{#1}
\providecommand\HyField@AuxAddToFields[1]{}
\providecommand\HyField@AuxAddToCoFields[2]{}
\babel@aux{french}{}
\@writefile{toc}{\contentsline {section}{\numberline {1}Project Description}{1}}
\@writefile{toc}{\contentsline {section}{\numberline {2}Data Description }{1}}
\@writefile{toc}{\contentsline {section}{\numberline {3}Evaluation}{2}}
\@writefile{toc}{\contentsline {section}{\numberline {4}Project Roadmap}{2}}
\@writefile{toc}{\contentsline {section}{\numberline {1}Project Description}{1}{section.1}}
\@writefile{toc}{\contentsline {section}{\numberline {2}Data Description }{1}{section.2}}
\@writefile{toc}{\contentsline {section}{\numberline {3}Evaluation}{2}{section.3}}
\@writefile{toc}{\contentsline {section}{\numberline {4}Project Roadmap}{2}{section.4}}
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex 2018.10.12) 25 OCT 2018 17:46
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex 2018.10.12) 30 OCT 2018 14:31
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
......@@ -453,6 +453,117 @@ Package babel Info: Making ? an active character on input line 414.
\FBfnindent=\skip47
))
(/usr/share/texlive/texmf-dist/tex/latex/carlisle/scalefnt.sty)
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/hyperref.sty
Package: hyperref 2018/02/06 v6.86b Hypertext links for LaTeX
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-hyperref.sty
Package: hobsub-hyperref 2016/05/16 v1.14 Bundle oberdiek, subset hyperref (HO)
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/hobsub-generic.sty
Package: hobsub-generic 2016/05/16 v1.14 Bundle oberdiek, subset generic (HO)
Package: hobsub 2016/05/16 v1.14 Construct package bundles (HO)
Package: infwarerr 2016/05/16 v1.4 Providing info/warning/error messages (HO)
Package: ltxcmds 2016/05/16 v1.23 LaTeX kernel commands for general use (HO)
Package: ifluatex 2016/05/16 v1.4 Provides the ifluatex switch (HO)
Package ifluatex Info: LuaTeX not detected.
Package hobsub Info: Skipping package `ifvtex' (already loaded).
Package: intcalc 2016/05/16 v1.2 Expandable calculations with integers (HO)
Package hobsub Info: Skipping package `ifpdf' (already loaded).
Package: etexcmds 2016/05/16 v1.6 Avoid name clashes with e-TeX commands (HO)
Package etexcmds Info: Could not find \expanded.
(etexcmds) That can mean that you are not using pdfTeX 1.50 or
(etexcmds) that some package has redefined \expanded.
(etexcmds) In the latter case, load this package earlier.
Package: kvsetkeys 2016/05/16 v1.17 Key value parser (HO)
Package: kvdefinekeys 2016/05/16 v1.4 Define keys (HO)
Package: pdftexcmds 2018/01/21 v0.26 Utility functions of pdfTeX for LuaTeX (HO
)
Package pdftexcmds Info: LuaTeX not detected.
Package pdftexcmds Info: \pdf@primitive is available.
Package pdftexcmds Info: \pdf@ifprimitive is available.
Package pdftexcmds Info: \pdfdraftmode found.
Package: pdfescape 2016/05/16 v1.14 Implements pdfTeX's escape features (HO)
Package: bigintcalc 2016/05/16 v1.4 Expandable calculations on big integers (HO
)
Package: bitset 2016/05/16 v1.2 Handle bit-vector datatype (HO)
Package: uniquecounter 2016/05/16 v1.3 Provide unlimited unique counter (HO)
)
Package hobsub Info: Skipping package `hobsub' (already loaded).
Package: letltxmacro 2016/05/16 v1.5 Let assignment for LaTeX macros (HO)
Package: hopatch 2016/05/16 v1.3 Wrapper for package hooks (HO)
Package: xcolor-patch 2016/05/16 xcolor patch
Package: atveryend 2016/05/16 v1.9 Hooks at the very end of document (HO)
Package atveryend Info: \enddocument detected (standard20110627).
Package: atbegshi 2016/06/09 v1.18 At begin shipout hook (HO)
Package: refcount 2016/05/16 v3.5 Data extraction from label references (HO)
Package: hycolor 2016/05/16 v1.8 Color options for hyperref/bookmark (HO)
)
(/usr/share/texlive/texmf-dist/tex/latex/oberdiek/auxhook.sty
Package: auxhook 2016/05/16 v1.4 Hooks for auxiliary files (HO)
)
(/usr/share/texlive/texmf-dist/tex/latex/oberdiek/kvoptions.sty
Package: kvoptions 2016/05/16 v3.12 Key value format for package options (HO)
)
\@linkdim=\dimen114
\Hy@linkcounter=\count100
\Hy@pagecounter=\count101
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/pd1enc.def
File: pd1enc.def 2018/02/06 v6.86b Hyperref: PDFDocEncoding definition (HO)
Now handling font encoding PD1 ...
... no UTF-8 mapping file for font encoding PD1
)
\Hy@SavedSpaceFactor=\count102
(/usr/share/texlive/texmf-dist/tex/latex/latexconfig/hyperref.cfg
File: hyperref.cfg 2002/06/06 v1.2 hyperref configuration of TeXLive
)
Package hyperref Info: Hyper figures OFF on input line 4509.
Package hyperref Info: Link nesting OFF on input line 4514.
Package hyperref Info: Hyper index ON on input line 4517.
Package hyperref Info: Plain pages OFF on input line 4524.
Package hyperref Info: Backreferencing OFF on input line 4529.
Package hyperref Info: Implicit mode ON; LaTeX internals redefined.
Package hyperref Info: Bookmarks ON on input line 4762.
\c@Hy@tempcnt=\count103
(/usr/share/texlive/texmf-dist/tex/latex/url/url.sty
\Urlmuskip=\muskip10
Package: url 2013/09/16 ver 3.4 Verb mode for urls, etc.
)
LaTeX Info: Redefining \url on input line 5115.
\XeTeXLinkMargin=\dimen115
\Fld@menulength=\count104
\Field@Width=\dimen116
\Fld@charsize=\dimen117
Package hyperref Info: Hyper figures OFF on input line 6369.
Package hyperref Info: Link nesting OFF on input line 6374.
Package hyperref Info: Hyper index ON on input line 6377.
Package hyperref Info: backreferencing OFF on input line 6384.
Package hyperref Info: Link coloring OFF on input line 6389.
Package hyperref Info: Link coloring with OCG OFF on input line 6394.
Package hyperref Info: PDF/A mode OFF on input line 6399.
LaTeX Info: Redefining \ref on input line 6439.
LaTeX Info: Redefining \pageref on input line 6443.
\Hy@abspage=\count105
\c@Item=\count106
\c@Hfootnote=\count107
)
Package hyperref Info: Driver (autodetected): hpdftex.
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def
File: hpdftex.def 2018/02/06 v6.86b Hyperref driver for pdfTeX
\Fld@listcount=\count108
\c@bookmark@seq@number=\count109
(/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty
Package: rerunfilecheck 2016/05/16 v1.8 Rerun checks for auxiliary files (HO)
Package uniquecounter Info: New unique counter `rerunfilecheck' on input line 2
82.
)
\Hy@SectionHShift=\skip48
)
(/usr/share/texlive/texmf-dist/tex/latex/dblfloatfix/dblfloatfix.sty
Package: dblfloatfix 2012/12/31 v1.0a (JAW)
......@@ -467,24 +578,26 @@ Package fixltx2e Warning: fixltx2e is not required with releases after 2015
Already applied: [0000/00/00] Old fixltx2e package on input line 53.
)
\@dblbotnum=\count100
\c@dblbotnumber=\count101
\@dblbotnum=\count110
\c@dblbotnumber=\count111
) (./sujet1_NER.aux)
\openout1 = `sujet1_NER.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 8.
LaTeX Font Info: ... okay on input line 8.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 8.
LaTeX Font Info: ... okay on input line 8.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 8.
LaTeX Font Info: ... okay on input line 8.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 8.
LaTeX Font Info: ... okay on input line 8.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 8.
LaTeX Font Info: ... okay on input line 8.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 8.
LaTeX Font Info: ... okay on input line 8.
LaTeX Font Info: Try loading font information for T1+lmr on input line 8.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Checking defaults for PD1/pdf/m/n on input line 9.
LaTeX Font Info: ... okay on input line 9.
LaTeX Font Info: Try loading font information for T1+lmr on input line 9.
(/usr/share/texmf/tex/latex/lm/t1lmr.fd
File: t1lmr.fd 2009/10/30 v1.6 Font defs for Latin Modern
)
......@@ -522,37 +635,102 @@ File: t1lmr.fd 2009/10/30 v1.6 Font defs for Latin Modern
* \@reversemarginfalse
* (1in=72.27pt=25.4mm, 1cm=28.453pt)
LaTeX Info: Redefining \degres on input line 8.
LaTeX Info: Redefining \dots on input line 8.
LaTeX Info: Redefining \up on input line 8.
LaTeX Info: Redefining \degres on input line 9.
LaTeX Info: Redefining \dots on input line 9.
LaTeX Info: Redefining \up on input line 9.
\AtBeginShipoutBox=\box26
Package hyperref Info: Link coloring OFF on input line 9.
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/nameref.sty
Package: nameref 2016/05/21 v2.44 Cross-referencing by name of section
Underfull \hbox (badness 10000) in paragraph at lines 46--47
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/gettitlestring.sty
Package: gettitlestring 2016/05/16 v1.5 Cleanup title references (HO)
)
\c@section@level=\count112
)
LaTeX Info: Redefining \ref on input line 9.
LaTeX Info: Redefining \pageref on input line 9.
LaTeX Info: Redefining \nameref on input line 9.
(./sujet1_NER.out) (./sujet1_NER.out)
\@outlinefile=\write3
\openout3 = `sujet1_NER.out'.
Underfull \hbox (badness 10000) in paragraph at lines 47--48
[]
Underfull \hbox (badness 10000) in paragraph at lines 50--54
Underfull \hbox (badness 10000) in paragraph at lines 49--50
[]
LaTeX Font Info: Try loading font information for T1+lmtt on input line 53.
(/usr/share/texmf/tex/latex/lm/t1lmtt.fd
File: t1lmtt.fd 2009/10/30 v1.6 Font defs for Latin Modern
)
LaTeX Font Info: Try loading font information for OT1+lmr on input line 53.
(/usr/share/texmf/tex/latex/lm/ot1lmr.fd
File: ot1lmr.fd 2009/10/30 v1.6 Font defs for Latin Modern
)
LaTeX Font Info: Try loading font information for OML+lmm on input line 53.
(/usr/share/texmf/tex/latex/lm/omllmm.fd
File: omllmm.fd 2009/10/30 v1.6 Font defs for Latin Modern
)
LaTeX Font Info: Try loading font information for OMS+lmsy on input line 53.
(/usr/share/texmf/tex/latex/lm/omslmsy.fd
File: omslmsy.fd 2009/10/30 v1.6 Font defs for Latin Modern
)
LaTeX Font Info: Try loading font information for OMX+lmex on input line 53.
(/usr/share/texmf/tex/latex/lm/omxlmex.fd
File: omxlmex.fd 2009/10/30 v1.6 Font defs for Latin Modern
)
LaTeX Font Info: External font `lmex10' loaded for size
(Font) <10> on input line 53.
LaTeX Font Info: External font `lmex10' loaded for size
(Font) <7> on input line 53.
LaTeX Font Info: External font `lmex10' loaded for size
(Font) <5> on input line 53.
Underfull \hbox (badness 10000) in paragraph at lines 51--58
[]
[1
{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}] [2] (./sujet1_NER.aux) )
{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}]
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 82.
[2]
Package atveryend Info: Empty hook `AfterLastShipout' on input line 82.
(./sujet1_NER.aux)
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 82.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 82.
Package rerunfilecheck Info: File `sujet1_NER.out' has not changed.
(rerunfilecheck) Checksum: EEBC0FA372FE78D08AD950EB40AF5DA5;201.
Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 82.
)
Here is how much of TeX's memory you used:
2549 strings out of 492982
33427 string characters out of 6134895
107465 words of memory out of 5000000
6122 multiletter control sequences out of 15000+600000
19403 words of font info for 20 fonts, out of 8000000 for 9000
6558 strings out of 492982
95403 string characters out of 6134895
193220 words of memory out of 5000000
10047 multiletter control sequences out of 15000+600000
30732 words of font info for 31 fonts, out of 8000000 for 9000
1141 hyphenation exceptions out of 8191
29i,4n,30p,696b,423s stack positions out of 5000i,500n,10000p,200000b,80000s
{/
usr/share/texmf/fonts/enc/dvips/lm/lm-ec.enc}</usr/share/texmf/fonts/type1/publ
ic/lm/lmbx12.pfb></usr/share/texmf/fonts/type1/public/lm/lmr10.pfb>
Output written on sujet1_NER.pdf (2 pages, 61304 bytes).
{/usr/share/texmf/fonts/enc/dvips/lm/lm-ec.enc}</usr/share/texmf/fonts/type1/
public/lm/lmbx12.pfb></usr/share/texmf/fonts/type1/public/lm/lmr10.pfb></usr/sh
are/texmf/fonts/type1/public/lm/lmtt10.pfb>
Output written on sujet1_NER.pdf (2 pages, 84918 bytes).
PDF statistics:
20 PDF objects out of 1000 (max. 8388607)
13 compressed objects within 1 object stream
0 named destinations out of 1000 (max. 500000)
1 words of extra memory for PDF output out of 10000 (max. 10000000)
62 PDF objects out of 1000 (max. 8388607)
54 compressed objects within 1 object stream
16 named destinations out of 1000 (max. 500000)
33 words of extra memory for PDF output out of 10000 (max. 10000000)
......@@ -4,6 +4,7 @@
\usepackage{lmodern}
\usepackage[a4paper]{geometry}
\usepackage{babel}
\usepackage{hyperref}
\usepackage{dblfloatfix}
\begin{document}
......@@ -45,12 +46,15 @@ We will concentrate on four types of named entities: persons, locations, organiz
We will use the CoNLL-2003 shared task data files. These files contain four columns separated by a single space. Each word has been put on a separate line and there is an empty line after each sentence. The first item on each line is a word, the second a part-of-speech (POS) tag, the third a syntactic chunk tag and the fourth the named entity tag. The chunk tags and the named entity tags have the format I-TYPE which means that the word is inside a phrase of type TYPE. Only if two phrases of the same type immediately follow each other, the first word of the second phrase will have tag B-TYPE to show that it starts a new phrase. A word with tag O is not part of a phrase.\\
The data consists of three files : one training file and two test files testa and testb. The first test file will be used in the development phase for finding good parameters for the learning system. The second test file will be used for the final evaluation.
The data consists of three files : one training file (about 15k sentences) and two test files testa and testb. The first test file will be used in the development phase for finding good parameters for the learning system. The second test file will be used for the final evaluation.\\
Data are available at :\\
train : \\
testa : \\
testb : \\
\vspace{-1mm}
\hspace{2cm} train : \url{http://perso.univ-lemans.fr/\~ fbouga/eng.train}\\
\vspace{-1mm}
\hspace{2cm} testa : \url{http://perso.univ-lemans.fr/\~ fbouga/eng.testa} \\
\vspace{-1mm}
\hspace{2cm} testb : \url{http://perso.univ-lemans.fr/\~ fbouga/eng.testb} \\
\newpage
\section{Evaluation}
......
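The data format described in the hunk above (four space-separated columns, one word per line, an empty line after each sentence, and I-TYPE/B-TYPE/O tags) can be sketched as a small reader. This is only an illustrative sketch, not part of the commit: the function names `parse_conll` and `entity_spans` and the sample sentence are assumptions made for the example, and the real files may contain extra lines (such as document separators) that this sketch does not treat specially.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str, str, str]  # (word, POS tag, chunk tag, named entity tag)


def parse_conll(lines: Iterable[str]) -> List[List[Token]]:
    """Group four-column lines into sentences; an empty line ends a sentence."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split(" ")
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        current.append((fields[0], fields[1], fields[2], fields[3]))
    if current:  # last sentence may lack a trailing empty line
        sentences.append(current)
    return sentences


def entity_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, type) spans from I-TYPE/B-TYPE/O tags."""
    spans: List[Tuple[int, int, str]] = []
    start = 0
    cur_type = None
    for i, tag in enumerate(tags):
        if tag == "O":
            prefix, typ = "O", None
        else:
            prefix, typ = tag.split("-", 1)
        # A phrase ends at O, at a type change, or at an explicit B- tag.
        if cur_type is not None and (typ != cur_type or prefix == "B"):
            spans.append((start, i, cur_type))
            cur_type = None
        if typ is not None and cur_type is None:
            start, cur_type = i, typ
    if cur_type is not None:
        spans.append((start, len(tags), cur_type))
    return spans


# Hypothetical sample in the four-column layout (not copied from the data files).
sample = """\
U.N. NNP I-NP I-ORG
official NN I-NP O
Ekeus NNP I-NP I-PER
heads VBZ I-VP O
for IN I-PP O
Baghdad NNP I-NP I-LOC
. . O O

Peter NNP I-NP I-PER
Blackburn NNP I-NP I-PER
"""

sentences = parse_conll(sample.splitlines())
first_tags = [token[3] for token in sentences[0]]
first_spans = entity_spans(first_tags)
adjacent_spans = entity_spans(["I-PER", "B-PER", "I-PER"])
```

The `B-` prefix only matters when two phrases of the same type are adjacent, which is why `entity_spans` closes the current phrase on a `B-` tag even if the type is unchanged.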