\documentclass[12pt]{article}
%\documentstyle[12pt,newexam,epsf]{article}
%\documentstyle[myexam]{article}
%\usepackage{graphicx}
\usepackage{newexam+shield}
\usepackage{epsf}
\input{abbrev}
%\input{matdec}

\pressmark{COM3110}
\department{\bf {DEPARTMENT OF COMPUTER SCIENCE}}
\examtitle{{\bf Text Processing}}
\examdate{{\bf Autumn Semester 2001-02}}
\examtime{{\bf 2 hours}}
\rubric{{\bf Answer THREE questions.
All questions carry equal weight. Figures in square brackets indicate the
percentage of available marks allocated to each part of a question.}}

%%%% local macros %%%%
\newcommand{\seq}[2]{\mbox{$#1_{1},\- \ldots ,\-#1_{#2}$}}
\newtheorem{prop}{Proposition}[questionnumber]
\newenvironment{proof}{\begin{trivlist}\item[\hskip \labelsep {\bf Proof}]}{\nopagebreak \rule{1mm}{3mm}\end{trivlist}}
\newcommand{\es}{\mbox{$\lambda$}}
\newcommand{\mrewrites}{\mbox{$\stackrel{*}{\Rightarrow}$}}
\newcommand{\psrule}{\mbox{$\rightarrow$}}
%\newcommand{\psrule}[2]{\mbox{{\rm #1}\ \ $\rightarrow$\ \ {\rm #2}}}
\newcommand{\synsemrule}[3]{%
\mbox{{\rm #1}\ \ $\rightarrow$\ \ {\rm #2}\ :\ ${\it #3}$}}
\newcommand{\scbrl}{\left[\hspace*{-0.12em}\left[}
\newcommand{\scbrr}{\right]\hspace*{-0.12em}\right]}
%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

\begin{exam}

%\hspace*{\fill}QUESTION CONTINUED ON NEXT PAGE
%\turnover
%\continued
%\questionsend

%\begin{question}
% \begin{qupart}
% \begin{exlist}
% \exitem
% \mypercent{10}
% \exitem
% \mypercent{20}
% \end{exlist}
% \end{qupart}
%\end{question}

% Character encoding, compression and text markup
\begin{question}

% Character Encoding + Unicode
\begin{qupart}
Schemes for electronic encoding of text which support all human
languages and permit interoperability of software on a global scale
require careful analysis of underlying issues in text representation
and design of appropriate standards.
\begin{exlist}
\exitem Explain each of the following terms and make clear the
relations between them: {\bf language}, {\bf script}, {\bf character},
{\bf glyph}, {\bf font}. Give examples of each.
\mypercent{20}
%\exitem Explain the overall goals and high level design principles
%underlying Unicode.
%What distinguishes Unicode from earlier character coding schemes?
%\mypercent{20}
\exitem Describe the Unicode coding model, making clear the levels in
the model, explaining the differences between and the motivations for
UTF-8 and UTF-16, and describing the purpose and implementation of
surrogate pairs.
\mypercent{30}
\end{exlist}
\end{qupart}

% Text Compression
% \begin{qupart}
% Text compression techniques are important because growth in volume
% of text continually threatens to outstrip increases in storage,
% bandwidth and processing capacity. Briefly explain the differences between:
% \begin{exlist}
% \exitem {\bf symbolwise} (or
% statistical) and {\bf dictionary} text compression methods;
% \mypercent{10}
% \exitem {\bf static}, {\bf semi-static} and {\bf adaptive}
% models for text compression;
% \mypercent{10}
% \exitem {\bf Huffman coding} and {\bf arithmetic coding} methods
% for text compression.
% \mypercent{10}
% \end{exlist}
% \end{qupart}

% SGML + Markup
\begin{qupart}
Information about documents is frequently stored in the document
itself using embedded annotations called ``markup''.
\begin{exlist}
\exitem What is the difference between a markup metalanguage and a
markup language? Give at least two examples of each.
\mypercent{10}
\exitem What is a DTD? Propose a simple SGML DTD for the abstract of a
journal article. It should require that the abstract contain one or
more author names, a title, a journal name, a volume number, an issue
number, page numbers, and the text of the abstract. Each author should
have an associated affiliation (e.g. their university).
Give a simple example of a fictitious abstract marked up using your DTD.
\mypercent{40}
\end{exlist}
\end{qupart}
\end{question}

\turnover

% Perl
\begin{question}

% Data Types + Data Structures (References)
\begin{qupart}
Perl has three basic data types.
\begin{exlist}
\exitem What are these three types? Give an example of each and
indicate how the type of data stored in a variable is conveyed
syntactically by the name of the variable.
\mypercent{10}
%\exitem With these basic types more complex data structures may
%be built using references. Explain what references are in Perl,
%and how they are created and used.
%\mypercent{10}
\exitem With these basic types more complex data structures may be
built using references. Describe a data structure that would be
appropriate to hold contact details for a set of persons -- for each
person a telephone number, email address and fax number are to be held.
%This should be done in a fashion that permits retrieval of, e.g.
%a fax number given the person's name and the keyword {\tt fax}.
\medskip

Give Perl code that adds a new person's data to the structure -- you
may assume each piece of data (person name, telephone number, email
address, fax number) is already held in a distinct named variable.
Also give code that extracts a data element (e.g. fax number) into a
variable given the data structure, a person's name, and a data element
keyword (e.g. {\tt fax}). Explain how your code works.
% Suppose you have a text file containing lines
% of the form:
% \begin{verbatim}
% Phil Jones tel: 766 8970 email: phil@someplace.com fax: 766 8975
% Tom Smith email: phil@someplace.com tel: 236 5770 fax: 766 8975
% \end{verbatim}
\mypercent{30}
\end{exlist}
\end{qupart}

% Subroutines + Scoping?
\begin{qupart}
Explain the difference between lexical and dynamic scoping of
variables in Perl, and the role of the {\tt my}, {\tt local} and
{\tt our} declarations.
\mypercent{20}
\end{qupart}

% Regular Expressions
\begin{qupart}
Regular expressions provide a very expressive language for pattern
matching in strings.
\begin{exlist}
\exitem Explain the difference between metacharacters and metasymbols
in Perl regular expressions and give at least two examples of each.
\mypercent{10}
\exitem Write a Perl regular expression which will match HTML anchor
tags and capture the value of the {\tt HREF} attribute and the
contents of the anchor tag itself.
\medskip

For example, suppose the following assignment has been made in a Perl
program:
\begin{verbatim}
$s = "<A HREF=\"http://www.cpan.org\">CPAN</A>";
\end{verbatim}
Your regular expression should match such strings and capture the
substrings \verb+"http://www.cpan.org"+ and \verb+"CPAN"+.
%{\tt$anchor_href}
Explain how your regular expression works.
\mypercent{30}
\end{exlist}
\end{qupart}

% Packages, Modules + OO
% \begin{qupart}
%
% \mypercent{20}
% \end{qupart}

\end{question}

%\continued
\questionsend

\end{exam}

\end{document}