American Sign Language Lexicon Video Dataset (ASLLVD)

In conjunction with NSF grant #0705749, "HCC: Large Lexicon Gesture Representation, Recognition, and Retrieval" (Stan Sclaroff, Carol Neidle, and Vassilis Athitsos, with invaluable contributions from PhD students Ashwin Thangali and Joan Nash, among other students assisting with the project), video examples have been collected at Boston University from up to 6 native ASL signers. Most of the lexical items are contained in the Gallaudet Dictionary of American Sign Language.

Video stimuli were presented to signers, who were asked to produce the sign they saw as they would naturally produce it. If a signer reported not normally using a given sign, we did not elicit that sign from that signer. The video stimuli for elicitation were supplemented to include additional signs that were not in the dictionary. Multiple synchronized views of the sign production were captured using high-quality video cameras.

Signers did not always produce the same sign that was shown in the prompt. In cases where a signer recognized and understood the prompted sign but used a different sign, or a different version of the same sign, the divergence appears in the data set. A given stimulus therefore resulted in productions that may vary in any of several ways: production of a totally different but synonymous sign; production of a lexical variant of the same sign; or production of essentially the same sign, differing in subtle ways in its articulation (as a result of regular phonological processes).

We have developed a framework for annotation that allows us to carefully delineate the variants attested in this dataset. We annotate distinct signs with distinct gloss labels (consistent with the gloss labels in use for our other data sets, cf. http://secrets.rutgers.edu/dai/queryPages/ ). Linguistic annotation has been carried out using a beta version of SignStream® version 3 (a Java reimplementation of SignStream version 2.2.2, which runs as a Macintosh Classic application) that provides capabilities for phonological annotation. Start and end frames were identified for each sign, and the handshapes used for each of the hands at the start and end of the sign were also annotated from this set: http://www.bu.edu/asllrp/cslgr/pages/handshape-palette.html.

As of the end of November 2011, we are in the final stages of linguistic annotation and verification of the annotations, which is a time-intensive challenge facilitated by a tool developed for this purpose by Ashwin Thangali. As soon as the verifications are complete, this annotated data set, complete with unique gloss labels and start and end handshapes for all signs, will be shared publicly.

Although the counts may change slightly as verifications occur, this is the current overview of the data that have been annotated to date:

data summary

To make it clear how this chart should be read: a total of 2,620 stimuli corresponding to monomorphemic lexical signs were shown to subjects. The resulting productions included examples of 2,990 distinct signs. For 1,021 of those signs, we have examples from a single signer; for 731 of them, we have examples from 2 signers, etc., and for 125 of those signs, we have examples from all 6 of our native signers. Since we have more than one example from a given signer in some cases, the total number of tokens per sign may be greater than the total number of signers whose productions of that sign are included in our data set. In fact, for 121 of the signs, we have more than 6 tokens. (For 2 of the signs, we have as many as 19 tokens.)
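The per-sign counts described above (number of distinct signers per sign, and number of tokens per sign) can be sketched as follows. This is only an illustration: the glosses, signer IDs, and the `summarize` helper are hypothetical and do not reflect the actual annotation files or tools.

```python
from collections import Counter, defaultdict

# Hypothetical (gloss, signer_id) pairs, one per token; not actual ASLLVD data.
tokens = [
    ("SIGN-A", 1), ("SIGN-A", 1), ("SIGN-A", 2),
    ("SIGN-B", 3),
    ("SIGN-C", 1), ("SIGN-C", 2), ("SIGN-C", 4),
]

def summarize(tokens):
    """Return (signs grouped by number of distinct signers, tokens per sign)."""
    signers_per_sign = defaultdict(set)
    tokens_per_sign = Counter()
    for gloss, signer in tokens:
        signers_per_sign[gloss].add(signer)  # a signer is counted once per sign
        tokens_per_sign[gloss] += 1          # every production counts as a token
    # Histogram: number of distinct signers -> number of signs with that many signers
    signs_by_signer_count = Counter(len(s) for s in signers_per_sign.values())
    return signs_by_signer_count, tokens_per_sign

signs_by_signer_count, tokens_per_sign = summarize(tokens)
```

In this toy example SIGN-A has 3 tokens but only 2 signers, which is the situation in which the token count for a sign exceeds its signer count.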

Note that in some cases, a given stimulus resulted in the production of both 1-handed and 2-handed versions of the sign, which is why the number of stimuli resulting in production of 1-handed signs (1,768) plus the number resulting in production of 2-handed signs (973) is slightly greater than the total number of stimuli (2,620).
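The size of that overlap can be estimated by inclusion-exclusion. The sketch below assumes that every stimulus resulted in at least one 1-handed or 2-handed production; that assumption is ours and is not stated in the summary chart.

```python
# Counts taken from the text above.
total_stimuli = 2620
one_handed = 1768   # stimuli resulting in at least one 1-handed production
two_handed = 973    # stimuli resulting in at least one 2-handed production

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|.
# Assumption: |A or B| equals the total number of stimuli.
both = one_handed + two_handed - total_stimuli
print(both)  # 121
```

Under that assumption, 121 stimuli resulted in both 1-handed and 2-handed productions.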

The data that will be made available will include a list of the variants for each sign in the set, including the start and end handshapes (on both the dominant and non-dominant hands) for each production, with links to the movie files, which will also be available for download. This is illustrated here:

examples of variants

In this case, the sign for 'accident' has three lexical variants, which are distinguished by handshape but otherwise have the same basic movement. These are considered to be lexical variants and they have distinct glosses, in this case with the distinguishing handshape noted as part of the gloss label (although that is not necessarily the case for lexical variant glosses; general glossing conventions are documented in reports 11 and 13, with an update now in progress).

See illustration of start and end handshapes for these three variants.

In some cases, the alternation in handshape, e.g., between the A and S handshapes shown for the end handshapes of (5)ACCIDENT (see this chart for explanation of the handshape labels: http://www.bu.edu/asllrp/cslgr/pages/handshape-palette.html), is quite productive under appropriate phonological conditions and is not a property associated specifically with this lexical item.

The availability of such data for our 9,000+ tokens will provide extremely valuable material for the study of the statistical distribution of handshapes, the types and frequencies of variations that occur, and the dependencies between the handshapes on the two hands and between start and end handshapes.
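As one illustration of the kind of analysis such data would support, the sketch below estimates how often each end handshape follows a given start handshape on the dominant hand. The `Token` record, its field names, and the sample values are hypothetical; they are not the format of the released annotations.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Token:
    """Hypothetical record for one annotated sign production."""
    gloss: str
    dom_start: str                      # dominant-hand start handshape label
    dom_end: str                        # dominant-hand end handshape label
    nondom_start: Optional[str] = None  # None for 1-handed signs
    nondom_end: Optional[str] = None

def end_given_start(tokens):
    """Relative frequency of each (start, end) pair among tokens sharing that start."""
    pair_counts = Counter((t.dom_start, t.dom_end) for t in tokens)
    start_counts = Counter(t.dom_start for t in tokens)
    return {pair: n / start_counts[pair[0]] for pair, n in pair_counts.items()}

# Invented sample tokens, for illustration only.
sample = [
    Token("SIGN-X", "5", "A"),
    Token("SIGN-X", "5", "S"),
    Token("SIGN-Y", "5", "A"),
    Token("SIGN-Z", "B", "B", nondom_start="B", nondom_end="B"),
]
probs = end_given_start(sample)
```

The same counting approach could be applied to pairs of dominant and non-dominant handshapes to examine dependencies between the two hands.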

Further information about the release of this data set will be provided in the near future.

In the meantime, a preliminary set of video files collected at the outset of this project, identified only by the stimulus video that was used to obtain the sign (i.e., without unique gloss labels and without annotations of handshapes), is available from this page, where further information about the video file formats will also be found:

http://csr.bu.edu/asl/asllvd/annotate/index-cvpr4hb08-dataset.html

Reports related to this project

V. Athitsos, C. Neidle, S. Sclaroff, J. Nash, A. Stefan, Q. Yuan and A. Thangali (2008) "The ASL Lexicon Video Dataset," CVPR 2008 Workshop on Human Communicative Behaviour Analysis (CVPR4HB'08).

H. Wang, A. Stefan, S. Moradi, V. Athitsos, C. Neidle, and F. Kamanga (2010) "A System for Large Vocabulary Sign Search," Proceedings of the Workshop on Sign, Gesture and Activity (SGA), September 2010.

A. Thangali, J.P. Nash, S. Sclaroff and C. Neidle (2011) "Exploiting Phonological Constraints for Handshape Inference in ASL Video," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.

Related resources

Contacts

For queries related to ASL data collection and linguistic annotations:

carol AT bu.edu

For questions regarding data capture and video file formats:

athitsos AT uta.edu & sclaroff AT cs.bu.edu