Carol Neidle, Using linguistic modeling and linguistically annotated corpora for computer-based sign language recognition from video. Invited presentation at the ChaLearn Looking at People Workshop 2014, to be held in conjunction with ECCV 2014 at the Swiss Federal Institute of Technology (ETH) Zurich, Switzerland. September 6-7, 2014.
This talk will report on recent and ongoing collaborative, crossdisciplinary research on computer-based recognition of American Sign Language (ASL) from monocular video, with attention to both the manual and non-manual components.
New 3D tracking and computational learning methods have been employed, in conjunction with linguistic modeling and linguistically annotated video corpora, for the automated detection and identification of essential linguistic information expressed through head motion and facial expressions that extend over varying phrasal domains (in work with D. Metaxas, B. Liu, J. Liu, X. Peng, and others). This modeling is also being applied to computer-based sign language generation, to create more linguistically realistic signing avatars (in joint work with M. Huenerfauth).
In collaborative work with computer scientists at Boston University (S. Sclaroff, A. Thangali) and Rutgers University (D. Metaxas, M. Dilsizian, and others), various approaches have been taken to increase the accuracy of recognition of hand configurations (an essential parameter in the formation of manual signs). These approaches incorporate exploitation of statistics from our annotated American Sign Language Lexicon Video Data set (ASLLVD) — containing nearly 10,000 examples of isolated signs — that reflect linguistic constraints and dependencies between the start and end hand configurations within a given "lexical" sign, and between the hand configurations on the dominant and non-dominant hand in 2-handed signs.
The videos, linguistic annotations, as well as some of the results and data visualizations from the computational analyses, are being made publicly accessible through our Web interface (DAI: Data Access Interface; implementation by C. Vogler) that facilitates browsing, searching, viewing, and downloading subsets of the available data.
Carol Neidle, Jingjing Liu, Bo Liu, Xi Peng, Christian Vogler, and Dimitris Metaxas, Computer-based Tracking, Analysis, and Visualization of Linguistically Significant Nonmanual Events in American Sign Language (ASL). Presented at the 6th Workshop on Representation and Processing of Sign Languages: Beyond the Manual Channel. LREC 2014, Reykjavik, Iceland, May 31, 2014.
Our linguistically annotated American Sign Language (ASL) corpora have formed a basis for research to automate detection by computer of essential linguistic information conveyed through facial expressions and head movements. We have tracked head position and facial deformations, and used computational learning to discern specific grammatical markings. Our ability to detect, identify, and temporally localize the occurrence of such markings in ASL videos has recently been improved by incorporation of (1) new techniques for deformable model-based 3D tracking of head position and facial expressions, which provide significantly better tracking accuracy and recover quickly from temporary loss of track due to occlusion; and (2) a computational learning approach incorporating 2-level Conditional Random Fields (CRFs), suited to the multi-scale spatio-temporal characteristics of the data, which analyzes not only low-level appearance characteristics, but also the patterns that enable identification of significant gestural components, such as periodic head movements and raised or lowered eyebrows. Here we summarize our linguistically motivated computational approach and the results for detection and recognition of nonmanual grammatical markings; demonstrate our data visualizations, and discuss the relevance for linguistic research; and describe work underway to enable such visualizations to be produced over large corpora and shared publicly on the Web.
Bo Liu, Jingjing Liu, Xiang Yu, Dimitris Metaxas and Carol Neidle, 3D Face Tracking and Multi-Scale, Spatio-temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL. Presented at LREC 2014, Reykjavik, Iceland, May 30, 2014.
Essential grammatical information is conveyed in signed languages by clusters of events involving facial expressions and movements of the head and upper body. This poses a significant challenge for computer-based sign language recognition. Here, we present new methods for the recognition of nonmanual grammatical markers in American Sign Language (ASL) based on: (1) new 3D tracking methods for the estimation of 3D head pose and facial expressions to determine the relevant low-level features; (2) methods for higher-level analysis of component events (raised/lowered eyebrows, periodic head nods and head shakes) used in grammatical markings―with differentiation of temporal phases (onset, core, offset, where appropriate), analysis of their characteristic properties, and extraction of corresponding features; (3) a 2-level learning framework to combine low- and high-level features of differing spatio-temporal scales. This new approach achieves significantly better tracking and recognition results than our previous methods.
Mark Dilsizian, Polina Yanovich, Shu Wang, Carol Neidle and Dimitris Metaxas, A New Framework for Sign Language Recognition based on 3D Handshape Identification and Linguistic Modeling. Presented at LREC 2014, Reykjavik, Iceland, May 29, 2014.
Current approaches to sign recognition by computer generally have at least some of the following limitations: they rely on laboratory conditions for sign production, are limited to a small vocabulary, rely on 2D modeling (and therefore cannot deal with occlusions and off-plane rotations), and/or achieve limited success. Here we propose a new framework that (1) provides a new tracking method less dependent than others on laboratory conditions and able to deal with variations in background and skin regions (such as the face, forearms, or other hands); (2) allows for identification of 3D hand configurations that are linguistically important in American Sign Language (ASL); and (3) incorporates statistical information reflecting linguistic constraints in sign production. For purposes of large-scale computer-based sign language recognition from video, the ability to distinguish hand configurations accurately is critical. Our current method estimates the 3D hand configuration to distinguish among 77 hand configurations linguistically relevant for ASL. Constraining the problem in this way makes recognition of 3D hand configuration more tractable and provides the information specifically needed for sign recognition. Further improvements are obtained by incorporation of statistical information about linguistic dependencies among handshapes within a sign derived from an annotated corpus of almost 10,000 sign tokens.
Carol Neidle, Two invited presentations at Syracuse University, Syracuse, NY, April 22, 2014:
- American Sign Language (ASL) and its Standing in the Academy
A recent survey by the Modern Language Association (MLA) of foreign language enrollments in the United States showed a dramatic increase in American Sign Language (ASL) enrollment since 1990. By 2009 ASL ranked as the 4th most studied foreign language in the United States, behind Spanish, French, and German. American universities have, in increasing numbers, come to accept ASL in satisfaction of their foreign language requirements. In this presentation, I will discuss some of the considerations relevant to such policy decisions, focusing on the linguistic properties of ASL, but also touching upon its cultural context and literary art forms. The study of American Sign Language, in part through the large collection of video materials now accessible to students, expands their horizons. It provides a valuable perspective on the nature of human language and reveals rich cultural and literary traditions generally unfamiliar to those outside of the Deaf community. It has the added benefit of enabling communication with those in the US and parts of Canada who use ASL as their primary language, as well as with Deaf people from elsewhere who have also learned ASL.
- Crossdisciplinary Approaches to Sign Language Research: Video Corpora for Linguistic Analysis and Computer-based Recognition of American Sign Language (ASL)
Carol Neidle, The American Sign Language Linguistic Research Project. NSF Avatar and Robotics Signing Creatures Workshop, Gallaudet University, Washington, DC, November 15, 2013.
Carol Neidle, Crossdisciplinary Approaches to American Sign Language Research: Linguistically Annotated Video Corpora for Linguistic Analysis and Computer-based Sign Language Recognition/Generation. CUNY Linguistics Colloquium, October 10, 2013.
After a brief introduction to the linguistic organization of American Sign Language (ASL), this talk presents an overview of collaborations between linguists and computer scientists aimed at advancing sign language linguistics and computer-based sign language recognition (and generation). Underpinning this research are expanding, linguistically annotated, video corpora containing multiple synchronized views of productions of native users of ASL. These materials are being shared publicly through a Web interface, currently under development, that facilitates browsing, searching, viewing, and downloading subsets of the data.
Two sub-projects are highlighted:
(1) Linguistic modeling used to enhance computer vision-based recognition of manual signs. Statistics emerging from an annotated corpus of about 10,000 citation-form sign productions by six native signers make it possible to leverage linguistic constraints to make sign recognition more robust.
(2) Recognition of grammatical information expressed through complex combinations of facial expressions and head gestures -- marking such things as topic/focus, distinct types of questions, negation, if/when clauses, relative clauses, etc. -- based on state-of-the-art face and head tracking combined with machine learning techniques. This modeling is also being applied to creation of more natural and linguistically realistic signing avatars. Furthermore, the ability to provide computer-generated graphs illustrating, for large data sets, changes in eyebrow height, eye aperture, and head position (e.g.) over time, in relation to the start and end points of the manual signs in the phrases with which non-manual gestures co-occur, opens up new possibilities for linguistic analysis of the nonmanual components of sign language grammar and for crossmodal comparisons.
The research reported here has resulted from collaborations with many people, including Stan Sclaroff and Ashwin Thangali (BU), Dimitris Metaxas, Mark Dilsizian, Bo Liu, and Jingjing Liu (Rutgers), Ben Bahan and Christian Vogler (Gallaudet), and Matt Huenerfauth (CUNY Queens College) and has been made possible by funding from the National Science Foundation.
Jingjing Liu, Bo Liu, Shaoting Zhang, Fei Yang, Peng Yang, Dimitris N. Metaxas and Carol Neidle, Recognizing Eyebrow and Periodic Head Gestures using CRFs for Non-Manual Grammatical Marker Detection in ASL. Presented at a Special Session on Sign Language, FG 2013: 10th IEEE International Conference on Automatic Face and Gesture Recognition, Shanghai, China, April 25, 2013.
Changes in eyebrow configuration, in combination with head gestures and other facial expressions, are used to signal essential grammatical information in signed languages. Motivated by the goal of improving the detection of non-manual grammatical markings in American Sign Language (ASL), we introduce a 2-level CRF method for recognition of the components of eyebrow and periodic head gestures, differentiating the linguistically significant domain (core) from transitional movements (which we refer to as the onset and offset). We use a robust face tracker and 3D warping to extract and combine the geometric and appearance features, as well as a feature selection method to further improve the recognition accuracy. For the second level of the CRFs, linguistic annotations were used as training for partitioning of the gestures, to separate the onset and offset. This partitioning is essential to recognition of the linguistically significant domains (in between). We then use the recognition of onset, core, and offset of these gestures together with the lower level features to detect non-manual grammatical markers in ASL.
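The onset/core/offset partitioning described above can be illustrated with a much-simplified dynamic-programming sketch (not the authors' 2-level CRF): given hypothetical per-frame scores for the three temporal phases, a Viterbi-style pass finds the best monotone segmentation of a gesture into onset, then core, then offset. All numbers and names here are made up for illustration.

```python
import numpy as np

STATES = ["onset", "core", "offset"]

def segment_gesture(frame_scores):
    """Viterbi-style partition of a gesture into onset -> core -> offset.
    Transitions may only stay in the same phase or advance by one."""
    n_frames, n_states = frame_scores.shape
    score = np.full((n_frames, n_states), -np.inf)
    back = np.zeros((n_frames, n_states), dtype=int)
    score[0, 0] = frame_scores[0, 0]          # gesture must begin in onset
    for t in range(1, n_frames):
        for s in range(n_states):
            for prev in (s - 1, s):           # stay, or advance one phase
                if prev < 0:
                    continue
                cand = score[t - 1, prev] + frame_scores[t, s]
                if cand > score[t, s]:
                    score[t, s] = cand
                    back[t, s] = prev
    # gesture must end in offset; trace back the best path
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [STATES[s] for s in reversed(path)]

# Hypothetical per-frame phase scores (rows = frames, cols = phases);
# in the actual system these would come from the first-level classifier.
scores = np.array([[2.0, 0.1, 0.0],
                   [1.5, 0.5, 0.0],
                   [0.2, 2.5, 0.1],
                   [0.1, 2.0, 0.3],
                   [0.0, 0.4, 2.2]])
print(segment_gesture(scores))   # ['onset', 'onset', 'core', 'core', 'offset']
```

The monotone phase constraint is what makes the "in between" core domain recoverable once transitional movements are separated out; the real model additionally conditions on learned low-level appearance and geometric features.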
Bo Liu, Dimitrios Kosmopoulos, Mark Dilsizian, Peng Yang, Carol Neidle, and Dimitris Metaxas, Detection and Classification of Non-manual Grammatical Markers in American Sign Language (ASL). Poster presented at the 2nd Multimedia and Vision Meeting in the Greater New York Area, Columbia University, New York City, NY, June 15, 2012.
Carol Neidle and Christian Vogler, A New Web Interface to Facilitate Access to Corpora: Development of the ASLLRP Data Access Interface, 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, LREC 2012, Istanbul, Turkey, May 27, 2012. <link to workshop proceedings>
A significant obstacle to broad utilization of corpora is the difficulty in gaining access to the specific subsets of data and annotations that may be relevant for particular types of research. With that in mind, we have developed a web-based Data Access Interface (DAI), to provide access to the expanding datasets of the American Sign Language Linguistic Research Project (ASLLRP). The DAI facilitates browsing the corpora, viewing videos and annotations, searching for phenomena of interest, and downloading selected materials from the website. The web interface, compared to providing videos and annotation files off-line, also greatly increases access by people who have no prior experience in working with linguistic annotation tools, and it opens the door to integrating the data with third-party applications on the desktop and in the mobile space. In this paper we give an overview of the available videos, annotations, and search functionality of the DAI, as well as plans for future enhancements. We also summarize best practices and key lessons learned that are crucial to the success of similar projects.
Carol Neidle, Ashwin Thangali and Stan Sclaroff, Challenges in Development of the American Sign Language Lexicon Video Dataset (ASLLVD) Corpus, 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, LREC 2012, Istanbul, Turkey, May 27, 2012. <link to workshop proceedings>
The American Sign Language Lexicon Video Dataset (ASLLVD) consists of videos of >3,300 ASL signs in citation form, each produced by 1-6 native ASL signers, for a total of almost 9,800 tokens. This dataset, including multiple synchronized videos showing the signing from different angles, will be shared publicly once the linguistic annotations and verifications are complete. Linguistic annotations include gloss labels, sign start and end time codes, start and end handshape labels for both hands, morphological and articulatory classifications of sign type. For compound signs, the dataset includes annotations for each morpheme. To facilitate computer vision-based sign language recognition, the dataset also includes numeric ID labels for sign variants, video sequences in uncompressed-raw format, camera calibration sequences, and software for skin region extraction. We discuss here some of the challenges involved in the linguistic annotations and categorizations. We also report an example computer vision application that leverages the ASLLVD: the formulation employs a HandShapes Bayesian Network (HSBN), which models the transition probabilities between start and end handshapes in monomorphemic lexical signs. Further details and statistics for the ASLLVD dataset, as well as information about annotation conventions, are available from http://www.bu.edu/asllrp/lexicon.
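The kind of start/end handshape dependency that the HSBN exploits can be sketched very simply: from annotated (start handshape, end handshape) pairs, estimate conditional transition probabilities by counting. This is only an illustrative maximum-likelihood sketch with invented annotation pairs, not the ASLLVD annotation format or the paper's Bayesian network formulation.

```python
from collections import Counter, defaultdict

# Hypothetical (start_handshape, end_handshape) annotations for
# monomorphemic lexical signs; labels are illustrative only.
annotations = [
    ("5", "S"), ("5", "S"), ("5", "5"),
    ("B", "B"), ("B", "A"), ("1", "1"),
]

def transition_probs(pairs):
    """P(end handshape | start handshape), estimated by counting."""
    counts = defaultdict(Counter)
    for start, end in pairs:
        counts[start][end] += 1
    return {s: {e: c / sum(ends.values()) for e, c in ends.items()}
            for s, ends in counts.items()}

probs = transition_probs(annotations)
print(probs["5"]["S"])   # 2 of the 3 signs starting in "5" end in "S"
```

Such conditional statistics can then reweight the hypotheses of a per-frame handshape recognizer, penalizing start/end combinations that are rare or unattested in the lexicon.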
Dimitris Metaxas, Bo Liu, Fei Yang, Peng Yang, Nicholas Michael, and Carol Neidle, Recognition of Nonmanual Markers in American Sign Language (ASL) Using Non-Parametric Adaptive 2D-3D Face Tracking, LREC 2012, Istanbul, Turkey, May 24, 2012. <link to LREC proceedings>
This paper addresses the problem of automatically recognizing linguistically significant nonmanual expressions in American Sign Language from video. We develop a fully automatic system
that is able to track facial expressions and head movements, and detect and recognize facial events continuously from video. The main contributions of the proposed framework are the following: (1) We have built a stochastic and adaptive ensemble of face trackers to address factors resulting in lost face track; (2) We combine 2D and 3D deformable face models to warp input frames, thus correcting for any variation in facial appearance resulting from changes in 3D head pose; (3) We use a combination of geometric features and texture features extracted from a canonical frontal representation. The proposed new framework makes it possible to detect grammatically significant nonmanual expressions from continuous signing and to differentiate successfully among linguistically significant expressions that involve subtle differences in appearance. We present results that are based on the use of a dataset containing 330 sentences from videos that were collected and linguistically annotated at Boston University.
Zoya Gavrilova, Stan Sclaroff, Carol Neidle, and Sven Dickinson, Detecting Reduplication in Videos of American Sign Language, LREC 2012, Istanbul, Turkey, May 25, 2012. <link to LREC proceedings>
A framework is proposed for the detection of reduplication in digital videos of American Sign Language (ASL). In ASL, reduplication is used for a variety of linguistic purposes, including overt marking of plurality on nouns, aspectual inflection on verbs, and nominalization of verbal forms. Reduplication involves the repetition, often partial, of the articulation of a sign. In this paper, the apriori algorithm for mining frequent patterns in data streams is adapted for finding reduplication in videos of ASL. The proposed algorithm can account for varying weights on items in the apriori algorithm’s input sequence. In addition, the apriori algorithm is extended to allow for inexact matching of similar hand motion subsequences and to provide robustness to noise. The formulation is evaluated on 105 lexical signs produced by two native signers. To demonstrate the formulation, overall hand motion direction and magnitude are considered; however, the formulation should be amenable to combining these features with others, such as hand shape, orientation, and place of articulation.
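The core intuition of mining repeated motion patterns can be conveyed with a toy sketch: quantize per-frame hand motion into direction symbols and count repeated contiguous subsequences. This is a deliberately simplified stand-in for the adapted apriori algorithm; it omits the paper's item weighting, inexact matching, and noise robustness, and the motion symbols are invented for illustration.

```python
from collections import Counter

def repeated_patterns(symbols, min_len=2, min_count=2):
    """Return contiguous subsequences (length >= min_len) that occur
    at least min_count times in the symbol stream."""
    found = {}
    n = len(symbols)
    for length in range(min_len, n // 2 + 1):
        counts = Counter(tuple(symbols[i:i + length])
                         for i in range(n - length + 1))
        for pattern, c in counts.items():
            if c >= min_count:
                found[pattern] = c
    return found

# e.g. an up-down movement cycle articulated three times,
# as might occur with plural marking or aspectual inflection
motion = ["U", "D", "U", "D", "U", "D"]
pats = repeated_patterns(motion)
print(pats[("U", "D")])   # the up-down cycle occurs 3 times
```

A repeated pattern of sufficient length and count is then a candidate reduplicated articulation; the full formulation scores candidates against weighted, noisy motion features rather than exact symbols.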
Carol Neidle and Stan Sclaroff, A Demonstration System For Video-Based Sign Language Retrieval. Presented at the Rafik B. Hariri Institute for Computing and Computational Science & Engineering, Boston University, May 15, 2012.
Carol Neidle, Conditional Constructions in ASL: Signs of Irrealis and Hypotheticality. Presented at the Institut Jean‐Nicod, CNRS, in Paris, France, April 4, 2012.
Nicholas Michael, Peng Yang, Dimitris Metaxas, and Carol Neidle, A Framework for the Recognition of Nonmanual Markers in Segmented Sequences of American Sign Language, British Machine Vision Conference 2011, Dundee, Scotland, August 31, 2011.
Ashwin Thangali, Joan P. Nash, Stan Sclaroff and Carol Neidle, Exploiting Phonological Constraints for Handshape Inference in ASL Video, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.
Handshape is a key linguistic component of signs, and thus, handshape recognition is essential to algorithms for sign language recognition and retrieval. In this work, linguistic constraints on the relationship between start and end handshapes are leveraged to improve handshape recognition accuracy. A Bayesian network formulation is proposed for learning and exploiting these constraints, while taking into consideration inter-signer variations in the production of particular handshapes. A Variational Bayes formulation is employed for supervised learning of the model parameters. A non-rigid image alignment algorithm, which yields improved robustness to variability in handshape appearance, is proposed for computing image observation likelihoods in the model. The resulting handshape inference algorithm is evaluated using a dataset of 1500 lexical signs in American Sign Language (ASL), where each lexical sign is produced by three native ASL signers.
Carol Neidle, with Joan Nash and Christian Vogler, Two Web-Accessible ASL Corpora. Sign language corpus workshop, Gallaudet University, Washington, DC, May 21, 2011.
Haijing Wang, Alexandra Stefan, Sajjad Moradi, Vassilis Athitsos, Carol Neidle, and Farhad Kamangar, A System for Large Vocabulary Sign Search. International Workshop on Sign, Gesture, and Activity (SGA) 2010, in conjunction with ECCV 2010. September 11, 2010. Hersonissos, Heraklion, Crete, Greece.
A method is presented that helps users look up the meaning of an unknown sign from American Sign Language (ASL). The user submits as a query a video of the unknown sign, and the system retrieves the most similar signs from a database of sign videos. The user then reviews the retrieved videos to identify the video displaying the sign of interest. Hands are detected in a semi-automatic way: the system performs some hand detection and tracking, and the user has the option to verify and correct the detected hand locations. Features are extracted based on hand motion and hand appearance. Similarity between signs is measured by combining dynamic time warping (DTW) scores, which are based on hand motion, with a simple similarity measure based on hand appearance. In user-independent experiments, with a system vocabulary of 1,113 signs, the correct sign was included in the top 10 matches for 78% of the test queries.
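The motion-based side of this retrieval can be sketched with a minimal DTW over 2D hand trajectories. This is illustrative only: the deployed system uses richer motion and appearance features and combines DTW scores with an appearance similarity term, and the trajectories and sign labels below are invented.

```python
import numpy as np

def dtw(query, candidate):
    """Classic dynamic time warping distance between two 2D trajectories."""
    n, m = len(query), len(candidate)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(query[i - 1] - candidate[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # skip a query frame
                                 cost[i, j - 1],      # skip a database frame
                                 cost[i - 1, j - 1])  # align the two frames
    return cost[n, m]

def rank_signs(query, database):
    """Return sign labels sorted by DTW distance to the query trajectory."""
    return sorted(database, key=lambda label: dtw(query, database[label]))

# Toy hand-center trajectories (x, y per frame); labels are hypothetical.
q = np.array([[0, 0], [1, 1], [2, 2]], dtype=float)
db = {"SIGN-A": np.array([[0, 0], [1, 1], [2, 2]], dtype=float),
      "SIGN-B": np.array([[0, 2], [2, 0], [0, 2]], dtype=float)}
print(rank_signs(q, db))   # SIGN-A (identical trajectory) ranks first
```

DTW's frame-alignment flexibility is what lets signs produced at different speeds by different signers still score as similar, which matters for user-independent retrieval.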
Stan Sclaroff, Vassilis Athitsos, Carol Neidle, Joan Nash, Alexandra Stefan, Ashwin Thangali, Haijing Wang, and Quan Yuan, American Sign Language Lexicon Project: Video Corpus and Indexing/Retrieval Algorithms (poster). International Workshop on Computer Vision (IWCV), Vietri Sul Mare, Salerno, Italy. May 25-27, 2010.
Nicholas Michael, Carol Neidle, Dimitris Metaxas, Computer-based recognition of facial expressions in ASL: from face tracking to linguistic interpretation. 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, LREC 2010, May 22-23, 2010.
Most research in the field of sign language recognition has focused on the manual component of signing, despite the fact that there is critical grammatical information expressed through facial expressions and head gestures. We, therefore, propose a novel framework for robust tracking and analysis of nonmanual behaviors, with an application to sign language recognition. Our method uses computer vision techniques to track facial expressions and head movements from video, in order to recognize such linguistically significant expressions. The methods described here have relied crucially on the use of a linguistically annotated video corpus that is being developed, as the annotated video examples have served for training and testing our machine learning models. We apply our framework to continuous recognition of three classes of grammatical expressions, namely wh-questions, negative expressions, and topics.
Vassilis Athitsos, Carol Neidle, Stan Sclaroff, Joan Nash, Alexandra Stefan, Ashwin Thangali, Haijing Wang, and Quan Yuan, Large Lexicon Project: American Sign Language Video Corpus and Sign Language Indexing/Retrieval Algorithms. 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, LREC 2010, May 22-23, 2010.
When we encounter a word that we do not understand in a written language, we can look it up in a dictionary. However, looking up the meaning of an unknown sign in American Sign Language (ASL) is not nearly as straightforward. This paper describes progress in an ongoing project aiming to build a computer vision system that helps users look up the meaning of an unknown ASL sign. When a user encounters an unknown ASL sign, the user submits a video of that sign as a query to the system. The system evaluates the similarity between the query and video examples of all signs in the known lexicon, and presents the most similar signs to the user. The user can then look at the retrieved signs and determine if any of them matches the query sign.
An important part of the project is building a video database containing examples of a large number of signs. So far we have recorded at least two video examples for almost all of the 3,000 signs contained in the Gallaudet dictionary. Each video sequence is captured simultaneously from four different cameras, providing two frontal views, a side view, and a view zoomed in on the face of the signer. Our entire video dataset is publicly available on the Web.
Automatic computer vision-based evaluation of similarity between signs is a challenging task. In order to improve accuracy, we manually annotate the hand locations in each frame of each sign in the database. While this is a time-consuming process, this process incurs a one-time preprocessing cost that is invisible to the end-user of the system. At runtime, once the user has submitted the query video, the current version of the system asks the user to specify hand locations in the first frame, and then the system automatically tracks the location of the hands in the rest of the query video. The user can review and correct the hand location results. Every correction that the user makes on a specific frame is used by the system to further improve the hand location estimates in other frames.
Once hand locations have been estimated for the query video, the system evaluates the similarity between the query video and every sign video in the database. Similarity is measured using the Dynamic Time Warping (DTW) algorithm, a well-known algorithm for comparing time series. The performance of the system has been evaluated in experiments where 933 signs from 921 distinct sign classes are used as the dataset of known signs, and 193 signs are used as a test set. In those experiments, only a single frontal view was used for all test and training examples. For 68% of the test signs, the correct sign is included in the 20 most similar signs retrieved by the system. In ongoing work, we are manually annotating hand locations in the remainder of our collected videos, so as to gradually incorporate more signs into our system. We are also investigating better ways for measuring similarity between signs, and for making the system more automatic, reducing or eliminating the need for the user to manually provide information to the system about hand locations.
Nicholas Michael, Dimitris Metaxas, and Carol Neidle, Spatial and Temporal Pyramids for Grammatical Expression Recognition of American Sign Language. Eleventh International ACM SIGACCESS Conference on Computers and Accessibility. Philadelphia, PA, October 26-28, 2009.
Carol Neidle, Nicholas Michael, Joan Nash, and Dimitris Metaxas, A Method for Recognition of Grammatically Significant Head Movements and Facial Expressions, Developed Through Use of a Linguistically Annotated Video Corpus. Workshop on Formal Approaches to Sign Languages, held as part of the 21st European Summer School in Logic, Language and Information, Bordeaux, France, July 20-31, 2009.
C. Neidle, Crossdisciplinary Corpus-Based ASL Research, 2008-2009 VL2 Presentation Series at Gallaudet University, Washington, DC, March 26, 2009, 4 pm.
This talk will (a) present information about a large, publicly available, linguistically annotated corpus, including high quality video files showing synchronized multiple views (with a close-up of the face) of Deaf native signers, and (b) discuss ways in which these data have been used in our linguistic and computer science collaborations. Projects include development of a sign look-up capability based on live or recorded video input, and recognition of various manual and non-manual properties of signing. This research has been supported by grants from the National Science Foundation (#CNS-0427988, #IIS-0705749).
V. Athitsos, C. Neidle, S. Sclaroff, J. Nash, A. Stefan, Q. Yuan, & A. Thangali, The ASL Lexicon Video Dataset. First IEEE Workshop on CVPR for Human Communicative Behavior Analysis. Anchorage, Alaska, June 28, 2008.
Philippe Dreuw, Carol Neidle, Vassilis Athitsos, Stan Sclaroff, and Hermann Ney, Benchmark Databases for Video-Based Automatic Sign Language Recognition, the Sixth International Conference on Language Resources and Evaluation (LREC). Morocco, May 2008.
C. Neidle, Sign Language Research: Challenges for Sharing and Dissemination of Video Files and Linguistic Annotations, Preservation and Discovery in the Digital Age (ARL Directors' Meeting), Cambridge, MA, November 14-15, 2007.
Our linguistic research and computer science collaborations, supported by the National Science Foundation, rely on a large annotated corpus of sign language video data, which we wish to share with the linguistic and computer science research communities. The challenges for organizing and storing such data in ways that make it possible for others to identify the contents, search for data of interest to them, and then download the relevant materials will be addressed in this presentation.
G. Tsechpenakis, D. Metaxas, O. Hadjiliadis, and C. Neidle, Robust Online Change-point Detection in Video Sequences, 2nd IEEE Workshop on Vision for Human Computer Interaction (V4HCI) in conjunction with IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06), New York, NY, June 2006.
T. Castelli, M. Betke, and C. Neidle, Facial Feature Tracking and Occlusion Recovery in American Sign Language, Pattern Recognition in Information Systems (PRIS-2006) Workshop, Cyprus, May 23-24, 2006.
C. Neidle, The crossmodal study of human language and its importance to linguistic theory, Revolutions in Sign Language Studies: Linguistics, Literature, Literacy. Gallaudet University, March 22-24, 2006.
G. Tsechpenakis, D. Metaxas, and C. Neidle, Learning-based dynamic coupling of discrete and continuous trackers, in Modeling People and Human Interaction workshop of ICCV 2005, Beijing, China, October 2005.
W. He, X. Huang, G. Tsechpenakis, D. Metaxas, and C. Neidle, Discovery of Informative Unlabeled Data for Improved Learning, in Modeling People and Human Interaction workshop of ICCV 2005, Beijing, China, October 2005.
C. Neidle and R.G. Lee, Aspects of ASL syntax: CP and the left and right peripheries, guest lecture at the Linguistic Society of America Summer Institute, MIT, July 28, 2005. [See ASLLRP Report no. 12]
C. Neidle and D. Metaxas, Linguistically-based computational methods for ASL recognition, SIGNA VOLANT, Sign Language Linguistics and the Application of Information Technology to Sign (SLL&IT), Milan, Italy, June 22-24, 2005.
C. Neidle, Another look at some non-manual expressions of syntactic information in ASL, SIGNA VOLANT, Sign Language Linguistics and the Application of Information Technology to Sign (SLL&IT), Milan, Italy, June 22-24, 2005.
C. Neidle, participant in a panel discussion: How analysis shapes data; 2nd Conference of the International Society for Gesture Studies (ISGS), INTERACTING BODIES - CORPS EN INTERACTION, École normale supérieure Lettres et Sciences humaines, Lyon, France, June 15-18, 2005.
D. Metaxas and C. Neidle, Linguistically-Based Computational Methods for ASL Recognition, Rutgers University Center for Cognitive Science (RuCCS) Colloquium, April 26, 2005 at 1 PM.
Signed languages involve simultaneous expression of different types of linguistic information through the manual and non-manual channels. In parallel with the signing of lexical material, primarily by the hands and arms, essential linguistic information is conveyed through facial expressions and gestures of the head and upper body, extending over varying, often phrasal, domains. We begin with an overview of the linguistic use of these channels in American Sign Language (ASL). Our linguistic studies have been accomplished in part through collection of video data from native ASL signers: high quality, synchronized views from multiple angles, including a close-up of the face. The annotations of the linguistically relevant components in the manual and non-manual channels have also been of critical importance in our research on computer-based ASL recognition. In this talk, we present and discuss recent advances in recognition of meaningful facial expressions made possible by the coupling of discrete and continuous tracking methods. The use of these methods to analyze hand motion is also enabling the discrimination of fingerspelled vs. non-fingerspelled signs. Given that these types of signs have very different internal linguistic structures, such discrimination is essential to recognition of manual signs.
This research has been supported by an NSF ITR grant to Rutgers University, Boston University, and Gallaudet University (D. Metaxas, C. Neidle and C. Vogler, PIs).
C. Neidle, Focus on the left and right peripheries in ASL. University of Toronto, February 4, 2005 at 3 PM.
Like other signed languages, ASL makes critical use of non-manual markings (gestures of the face and upper body) to express syntactic information. Quite a few different constructions, including 'if' and 'when' clauses, focused NPs, and 'relative clauses' (really correlatives), are characterized by very similar non-manual markings. Not coincidentally, all of these phrases normally occur in a sentence-initial position and share some interesting semantic and syntactic commonalities. It is argued here that this position and grammatical marking are associated with Focus. The proposed analysis further provides an explanation (in terms of Rizzi's Relativized Minimality) for previously puzzling semantic and syntactic differences between wh-questions in which the wh-phrase does or does not undergo movement to the right periphery of the clause.
Considerations of focus also provide the key to understanding the apparent optionality of the non-manual realization of subject agreement. We argue that this non-manual marking, which suffices to license null subjects, in fact functions to mark focus. Thus, for example, the VP may or may not bear this focus marking -- which, when present, invariably includes an overt non-manual expression of subject agreement.
Thus, two puzzling cases of apparent optionality in the syntax of ASL are considered. In both cases, it is argued that focus is the relevant factor differentiating the variants.
C. Neidle, Language in another dimension: the syntax of American Sign Language (ASL). York University, DLLL, Ross S 562, February 3, 2005 at 4 PM.
Comparison of the way in which language is manifested in the visual-gestural and aural-oral modalities offers important insights into the nature of the human language faculty. This talk will provide an overview of the linguistic organization of ASL, focusing on the syntax of the language. In parallel with lexical items, which are articulated primarily by the hands, essential syntactic information is expressed through gestures of the face, head, and upper body. These extend over phrases (rather than individual lexical items) to convey information about, e.g., negation, question status and type, reference, subject and object agreement, mood, tense, aspect, definiteness, specificity, and information status (topic, focus). Video examples illustrating the constructions under discussion, as signed by native ASL signers, will be shown.
Because of the difficulties involved in studying language in the visual modality, the American Sign Language Linguistic Research Project has developed a computer application, SignStream, which has been invaluable in our own syntactic research and in collaborations with computer scientists interested in the problems of sign language recognition. Our annotated video data (multiple synchronized views of the signing, including a close-up of the face) may be of use to other researchers, as well. The presentation will include a brief demonstration of SignStream--a tool that can be applied more generally to the linguistic study of any kind of digital video data. It will conclude with mention of the collaborative research now underway on issues related to the problem of ASL recognition by computer.
C. Neidle, Reflexes of Focus in American Sign Language, Colloquium, Linguistics Department. University of Massachusetts, Amherst, November 12, 2004.
This presentation will begin with a brief overview of our syntactic research on American Sign Language, including mention of available annotated data sets (including high-quality video with multiple synchronized views of native signers) and collaboration with computer scientists interested in the problem of sign language recognition. Further information about the research to be reported on here, sponsored in part by grants from the National Science Foundation, is available from http://www.bu.edu/asllrp/.
Like other signed languages, ASL makes critical use of non-manual markings (gestures of the face and upper body) to express syntactic information. Quite a few different constructions, including 'if' and 'when' clauses, focused NPs, and 'relative clauses' (really correlatives), are characterized by very similar non-manual markings. Not coincidentally, all of these phrases normally occur in a sentence-initial position and share some interesting semantic and syntactic commonalities. It is argued here that this position and grammatical marking are associated with Focus. The proposed analysis further provides an explanation (in terms of Rizzi's Relativized Minimality) for previously puzzling semantic and syntactic differences between wh-questions in which the wh-phrase does or does not undergo movement to the right periphery of the clause.
Considerations of focus also provide the key to understanding the apparent optionality of the non-manual realization of subject agreement. We argue that this non-manual marking, which suffices to license null subjects, in fact functions to mark focus. Thus, for example, the VP may or may not bear this focus marking -- which, when present, includes an overt non-manual expression of subject agreement. Thus, two puzzling cases of apparent optionality in the syntax of ASL are considered. In both cases, it is argued that focus is the relevant factor differentiating the variants.
C. Neidle, Dimensions of linguistic research: Analysis of a signed language (plenary address). Australian Linguistic Society conference 2004, Sydney, Australia, July 13-15, 2004.
In signed languages, crucial information is expressed, in parallel, by manual signs and by facial expressions and movements of the head. Despite some interesting modality-dependent differences, the fundamental properties of spoken and signed languages are strikingly similar. Beginning with an overview of the linguistic organization of American Sign Language (ASL), this presentation will focus on several syntactic constructions in ASL that have been the subject of some controversy. Specifically, the nature and distribution of syntactic agreement and alternative structures for wh-questions will be discussed. In both cases, it will be argued that semantic and pragmatic factors (related to ‘focus’) differentiate seemingly optional syntactic variants.
Because of the difficulties of studying language in the visual modality, the American Sign Language Linguistic Research Project has developed a computer application, SignStream, which has been invaluable in our own syntactic research and in collaborations with computer scientists interested in the problems of sign language recognition. Our annotated video data (including synchronized views showing different perspectives of the signing and a close-up of the face) may be of use to other researchers, as well. The presentation will conclude with a brief demonstration of SignStream (a tool that can be applied more generally to the linguistic study of any kind of digital video) and information about the available annotated corpus of data collected from native users of American Sign Language.
C. Neidle and R.G. Lee, Unification, competition and optimality in signed languages: aspects of the syntax of American Sign Language (ASL). International Lexical Functional Grammar Conference. Christchurch, New Zealand, July 10-12, 2004.
C. Neidle, Resources for the study of visual language data: SignStream software for linguistic annotation and analysis of digital video and an annotated ASL corpus. International Lexical Functional Grammar Conference. Christchurch, New Zealand, July 10-12, 2004.
C. Neidle and R.G. Lee, Dimensions of the Linguistic Analysis of ASL: Challenges for Computer-Based Recognition. HCI Seminar Series, Computer Science and Artificial Intelligence Laboratory, MIT, June 9, 2004.
ASL and other signed languages have only recently been recognized to be full-fledged natural languages worthy of scientific study. These languages make use of complex articulations of the hands in parallel with linguistically significant facial gestures and head movements. Until recently, the lack of sophisticated tools for capture, annotation, retrieval, and analysis of the complex interplay of manual and non-manual elements has held back linguistic research. Thus, although there have been some fascinating discoveries in recent years, additional research on the linguistic structure of signed languages is badly needed. The relatively limited linguistic understanding of signed languages is, in turn, a hindrance for progress in computer-based sign language recognition. For these reasons, crossdisciplinary collaborations are needed in order to achieve advances in both domains.
This talk provides an overview of the nature of linguistic expression in the visual-spatial modality. It includes a demonstration of SignStream, an application developed by our research group for the annotation of video-based language data. SignStream has been used for creation of a substantial corpus of linguistically annotated data from native users of ASL, including high-quality synchronized video files showing the signing from multiple angles as well as a close-up view of the face. These video files and associated annotations, which have been used by many linguists and computer scientists (working separately and together), are being made publicly available.
C. Neidle and R.G. Lee, The SignStream Project: Available Resources and Future Directions. Workshop on the Representation and Processing of Sign Languages: From SignWriting to Image Processing. Information techniques and their implications for teaching, documentation and communication. Workshop on the occasion of the 4th International Conference on Language Resources and Evaluation (LREC 2004). Lisbon, Portugal. May 30, 2004.
C. Neidle and R.G. Lee, Syntactic agreement across language modalities: American Sign Language, Lisbon Workshop on Agreement, Universidade Nova de Lisboa, July 10-11, 2003.
R.G. Lee and C. Neidle, New developments in the annotation of gesture and sign language data: Advances in the SignStream project. The 5th International Workshop on Gesture and Sign Language based Human-Computer Interaction, Genova, Italy. April 15-17, 2003.
C. Neidle and R.G. Lee, New developments in the annotation of gesture and sign language data: Advances in the SignStream project. University of Palermo, Italy. April 14, 2003 at 10.30.
C. Neidle and R.G. Lee, Questions and Conditionals in American Sign Language: Focusing on the left and right periphery, Università Milano-Bicocca, Italy. April 10, 2003 at 13.30.
Like other signed languages, ASL makes critical use of non-manual markings (gestures of the face and upper body) to express syntactic information. Quite a few different constructions, including 'if' clauses, 'when' clauses, focused NPs, and relative clauses, are characterized by very similar non-manual markings. Not coincidentally, all of these phrases normally occur in a sentence-initial position and share some interesting semantic and syntactic commonalities. We argue here that this position and grammatical marking are associated with Focus. The proposed analysis further provides an explanation (in terms of Rizzi's Relativized Minimality) for previously puzzling semantic and syntactic differences between wh-questions in which the wh-phrase does or does not undergo movement to the right periphery of the clause.
C. Neidle, with Fran Conlin and Lana Cook. Focus and Wh Constructions in ASL. Harvard University GSAS Workshop Series on Comparative Syntax and Linguistic Theory. Cambridge, MA. May 17, 2002.
Wh-question constructions in American Sign Language (ASL) containing in situ wh phrases and moved wh phrases appearing at the right periphery of the clause will be compared. A syntactic account involving distinct projections for focus and wh phrases will be proposed. The differences in interpretation of the two constructions will be argued to follow from Relativized Minimality. Evidence in support of this analysis comes from (1) certain restrictions on the occurrence of wh-movement, and (2) the distribution of an indefinite focus particle in ASL.
The findings to be discussed have emerged from joint work with Ben Bahan, Sarah Fish, and Paul Hagstrom, supported in part by grants from the National Science Foundation.
C. Neidle. Visual Analysis of Signed Language Data. Rutgers Center for Cognitive Science Colloquium. April 23, 2002.
Research on recognition and generation of signed languages and the gestural component of spoken languages has been held back by unavailability of large-scale linguistically annotated corpora of the kind that led to significant advances in the area of spoken language. A major obstacle to the production of such corpora has been the lack of computational tools to assist in efficient transcription and analysis of visual language data.
This talk will begin with a discussion of a few of the interesting linguistic characteristics of language in the visual-gestural modality that pose a challenge for sign language recognition. Next, there will be a demonstration of SignStream, a computer program that we have designed to facilitate the linguistic analysis of visual language data. A substantial amount of high quality video data -- including multiple synchronized views of native users of American Sign Language (ASL) -- is being collected in our Center for Sign Language and Gesture Resources. These data, along with linguistic annotations produced through the use of SignStream, are being made publicly available in a variety of formats.
The second part of this talk will present some results of collaborative research now in progress between linguists and computer scientists in the development of computer vision algorithms to detect linguistically significant aspects of signing and gesture. The visual analysis of the sign language data will be discussed and relevant video examples will be presented.
The projects described here have been carried out in collaboration with Stan Sclaroff (Boston University) and Dimitris Metaxas (Rutgers), among other researchers, and they have been supported by grants from the National Science Foundation. Further information is available at http://www.bu.edu/asllrp/.
C. Neidle. American Sign Language: Linguistic Perspectives. Rutgers University Language and Linguistics Club. April 22, 2002.
A general overview of the linguistic organization of American Sign Language (ASL) will be presented. It will be argued that the fundamental linguistic properties of signed and spoken languages are the same, despite some interesting differences attributable to modality. SignStream, a computer program that we have designed to facilitate linguistic analysis of visual language data, will be demonstrated briefly. I will also report on ongoing collaborative research with computer scientists at Boston University (Stan Sclaroff) and at Rutgers (Dimitris Metaxas) and demonstrate some of our latest results.
C. Neidle. Master Class for linguistics faculty: The Linguistics of American Sign Language. Rutgers University, April 22, 2002.
C. Neidle, Linguistic annotation and analysis of signed language using SignStream. Invited presentation at the IEEE International Workshop on Cues in Communication ("Cues 2001") held in conjunction with CVPR'2001. Kauai, Hawaii, USA. December 9, 2001.
C. Neidle, SignStream as a tool for linguistic annotation of signed language data. TalkBank Gesture Workshop, Carnegie Mellon University, Pittsburgh, PA, October 26-28, 2001.
C. Neidle, SignStream, a database tool that facilitates the encoding and linguistic analysis of visual-gestural data. Transcription de la parole normale et pathologique, with a special session on the transcription of sign languages and gesture. Université de Tours, France. December 8-9, 2000.
C. Neidle and S. Sclaroff, SignStream: A tool for linguistic and computational research on visual-gestural language data. Third International Conference on Methods and Techniques in Behavioral Research. Nijmegen, The Netherlands, August 15-18, 2000.
Research on recognition and generation of signed languages and the gestural component of spoken languages has been held back by the unavailability of large-scale linguistically annotated corpora of the kind that led to significant advances in the area of spoken language. A major obstacle to the production of such corpora has been the lack of computational tools to assist in efficient analysis and transcription of visual language data.
The first part of this talk will present SignStream, a computer program that we have designed to facilitate the transcription and linguistic analysis of visual language data. SignStream provides a single computing environment for manipulating digital video and linking specific frame sequences to simultaneously occurring linguistic events encoded in a fine-grained multi-level transcription. Items from different fields are visually aligned on the screen to reflect their temporal relations, as illustrated in Figure 1.
Figure 1. SignStream: video and gloss windows.
We will describe the capabilities of the current release--which is distributed on a non-profit basis to educators and researchers--as well as additional features currently under development.
Although SignStream may be of use for the analysis of any visual language data (including data from signed languages as well as the gestural component of spoken languages), we have been using the program primarily to analyze data from American Sign Language (ASL). This has resulted in a growing corpus of linguistically annotated ASL data (as signed by native signers). In the second part of this talk, we will discuss the ways in which the annotated corpus is being used in the development and refinement of computer vision algorithms to detect linguistically significant aspects of signing and gesture. This research is being conducted within the context of the National Center for Sign Language and Gesture Resources, which has established state-of-the-art digital video data collection facilities at Boston University and the University of Pennsylvania. Each lab is equipped with multiple synchronized digital cameras (see Figure 2) that capture different views of the subject (see Figure 3).
Figure 2. National Center for Sign Language and Gesture Resources: data collection facility at Boston University.
Figure 3. Three views of a signer.
The video data collected in this facility are being made publicly available in multiple video file formats, along with the associated linguistic annotations.
The projects described here have been supported by grants from the National Science Foundation.
C. Neidle, SignStream, a tool for crosslinguistic analysis of signed languages. Seventh International Conference on Theoretical Issues in Sign Language Research (TISLR 2000). Amsterdam, The Netherlands. July 23-27, 2000.
D. MacLaughlin, Syntactic research on American Sign Language: The state of the art. Applied linguistics research seminar. Boston University, October 25, 1999.
C. Neidle and D. MacLaughlin, Language in another dimension: American Sign Language. Dartmouth College, Hanover, NH, October 22, 1999.
C. Neidle, D. MacLaughlin, and R.G. Lee, Language in the visual modality: Linguistic research and computational tools. University of Pennsylvania, May 7, 1999.
This talk focuses on issues surrounding linguistic research on signed languages and presents a demonstration of a computational tool to facilitate analysis of visual language data. We begin with an overview of the linguistic organization of American Sign Language. Next, we address particular problems associated with the collection, analysis, and dissemination of sign language data. We then present a demonstration of SignStream (http://www.bu.edu/asllrp/SignStream), a multimedia database tool currently under development as part of the American Sign Language Linguistic Research Project (http://www.bu.edu/asllrp).
SignStream-encoded language data will be made publicly available as part of a collaborative venture with the University of Pennsylvania to establish a National Center for Sign Language and Gesture Resources (U Penn: D. Metaxas, N. Badler, M. Liberman; Boston U: C. Neidle, S. Sclaroff).
C. Neidle, SignStream: A Multimedia Tool for Language Research. Poster presented at the National Science Foundation Human Computer Interaction Grantees' Workshop '99, Orlando, Florida, February 22, 1999.
C. Neidle and D. MacLaughlin, The Distribution of Functional Projections in ASL: Evidence from overt expressions of syntactic features. Workshop: The Mapping of functional projections. Università di Venezia and Venice International University, Isola di San Servolo, Venice, Italy, January 29, 1999.
C. Neidle, R.G. Lee, D. MacLaughlin, B. Bahan, and J. Kegl, SignStream and the American Sign Language Linguistic Research Project. University of Stellenbosch, South Africa, February 19, 1997.
C. Neidle, J. Kegl, B. Bahan, D. MacLaughlin, and R. G. Lee, Non-manual Grammatical Marking as Evidence for Hierarchical Relations in American Sign Language. Symposium presented at the Fifth International Conference on Theoretical Issues in Sign Language Research, Montreal. September 21, 1996. The symposium included the following papers:
C. Neidle, J. Kegl, D. MacLaughlin, R. G. Lee, and B. Bahan, The Distribution of Non-Manual Correlates of Syntactic Features: Evidence for the Hierarchical Organization of ASL.
B. Bahan, C. Neidle, D. MacLaughlin, J. Kegl, and R. G. Lee, Non-Manual Realization of Agreement in the Clause.
D. MacLaughlin, C. Neidle, B. Bahan, J. Kegl, and R. G. Lee, Non-Manual Realization of Agreement in the Noun Phrase.
C. Neidle, J. Kegl, D. MacLaughlin, B. Bahan, and R. G. Lee, Demonstration of SignStream as part of Technology Displays at the Fifth International Conference on Theoretical Issues in Sign Language Research, Montreal. September 20, 1996.
C. Neidle, J. Kegl, D. MacLaughlin, B. Bahan, R. G. Lee, J. Hoza, O. Foelsche, and D. Greenfield, SignStream: A Multimedia Database Tool for Linguistic Analysis. Poster session and software demonstration presented at the Fifth International Conference on Theoretical Issues in Sign Language Research, Montreal. September 19, 1996.
D. MacLaughlin, C. Neidle, B. Bahan, and J. Kegl, SignStream: A Multimedia Database for Sign Language Research. Talk and demonstration presented at the Human Movement Coding Workshop at City University, London, England, May 29, 1996.
D. MacLaughlin, C. Neidle, B. Bahan, and J. Kegl, The SignStream Project. Talk and demonstration presented at the University of Durham, England, May 28, 1996.
B. Bahan, C. Neidle, D. MacLaughlin, and J. Kegl, Non-Manual Realization of Agreement in American Sign Language. Talk presented at the University of Durham, England, May 27, 1996.
C. Neidle, D. MacLaughlin, B. Bahan, and J. Kegl, Non-Manual Correlates of Syntactic Agreement in American Sign Language. Talk presented at the University of Iceland, Reykjavik, May 23, 1996.
C. Neidle, D. MacLaughlin, J. Kegl and B. Bahan, SignStream: A Multimedia Database for Sign Language Research. Talk presented at the University of Iceland, Reykjavik, May 22, 1996.
B. Bahan, ASL Literary Traditions. Talk presented at the University of Virginia, Charlottesville, April 12, 1996.
J. Kegl, B. Bahan, D. MacLaughlin, and C. Neidle, Overview of Arguments about Rightward Movement in ASL. Talk presented in the course on the Linguistic Structure of American Sign Language at the 1995 Linguistic Institute, University of New Mexico, Albuquerque, NM, July 11, 1995.
J. Kegl, B. Bahan, O. Foelsche, D. Greenfield, J. Hoza, D. MacLaughlin, and C. Neidle, SignStream: A Multimedia Tool for Language Analysis. Talk presented at the Conference on Gestures Compared Cross-Linguistically, 1995 Linguistic Institute, University of New Mexico, Albuquerque, NM, July 10, 1995.
C. Neidle, D. MacLaughlin, J. Kegl, B. Bahan, and D. Aarons, Overt Realization of Syntactic Features in American Sign Language. Syntax Seminar, University of Trondheim, Trondheim, Norway, May 30, 1995.
B. Bahan, J. Kegl, D. MacLaughlin, and C. Neidle, Convergent Evidence for the Structure of Determiner Phrases in American Sign Language. Formal Linguistics Society of Mid-America, Bloomington, IN, May 19-21, 1995.
B. Bahan, J. Kegl, and C. Neidle, Introduction to Syntax for Sign Language Teachers; and The Spread of Grammatical Facial Expressions and Evidence of the Structure of Sentences in ASL. New York Statewide Conference for Teachers of American Sign Language. June 4, 1994.
C. Neidle, J. Kegl, and B. Bahan, The Architecture of Functional Categories in American Sign Language, Harvard University Linguistics Department Colloquium, Cambridge, MA, May 2, 1994.
J. Shepard-Kegl, C. Neidle, and J. Kegl, Legal Ramifications of an Incorrect Analysis of Tense in ASL. American Association for Applied Linguistics, Baltimore, MD, March 8, 1994.
C. Neidle and J. Kegl, The Architecture of Functional Categories in American Sign Language. Syracuse University Linguistics Colloquium, Syracuse, NY, December 10, 1993.
D. Aarons, B. Bahan, J. Kegl, and C. Neidle, Tense and Agreement in American Sign Language. Fourth International Conference on Theoretical Issues in Sign Language Research, University of California, San Diego, August 5, 1992.
D. Aarons, J. Kegl, and C. Neidle, Subjects and Agreement in American Sign Language. Fifth International Symposium on Sign Language Research, Salamanca, Spain, May 26, 1992.