MegaPixels
UCCS
One of 16,149 images from the UnConstrained College Students face recognition dataset captured at University of Colorado Colorado Springs

UnConstrained College Students

Update: In response to this report and its previous publication of metadata from UCCS dataset photos, UCCS has temporarily suspended its dataset, but plans to release a new version.

UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at University of Colorado Colorado Springs and developed primarily for "face detection and recognition research towards surveillance applications" 1.

According to the authors of two papers associated with the dataset, over 1,700 students and pedestrians were "photographed using a long-range high-resolution surveillance camera without their knowledge". 3 This analysis examines the contents of the UCCS dataset, its funding sources, timestamp data, and information from publicly available research project citations.

The UCCS dataset includes over 1,700 unique identities, most of which are students walking to and from class. In 2018, it was the "largest surveillance [face recognition] benchmark in the public domain." 4 The photos were taken during the spring semesters of 2012 and 2013 on the West Lawn of the University of Colorado Colorado Springs campus. The photographs were timed to capture students during breaks between their scheduled morning and afternoon classes, Monday through Thursday. "For example, a student taking Monday-Wednesday classes at 12:30 PM will show up in the camera on almost every Monday and Wednesday." 2

 The location at University of Colorado Colorado Springs where students were surreptitiously photographed with a long-range surveillance camera for use in a defense and intelligence agency funded research project on face recognition. Image: Google Maps

The long-range surveillance images in the UnConstrained College Students dataset were taken using a Canon 7D 18-megapixel digital camera fitted with a Sigma 800mm F5.6 EX APO DG HSM telephoto lens, pointed out of an office window across the university's West Lawn. The students were photographed from a distance of approximately 150 meters. "The camera [was] programmed to start capturing images at specific time intervals between classes to maximize the number of faces being captured." 2 This setup made it impossible for students to know they were being photographed, providing the researchers with realistic surveillance images to help build face recognition systems for real-world applications for defense, intelligence, and commercial partners.

 Example images from the UnConstrained College Students Dataset. Photos from UnConstrained College Students dataset, made available under a modified ODC Attribution License http://www.vast.uccs.edu/UCCS/License.txt

The EXIF data embedded in the images shows that the photo capture times follow a pattern similar to the one outlined by the researchers, but also highlights that the largest share of photos (over 7,000) was taken on Tuesdays around noon, during students' lunch break. The absence of any photos taken from Friday through Sunday suggests that the researchers were only interested in capturing images of students during peak campus hours.
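A breakdown like this could be derived roughly as follows. This is a minimal sketch, assuming the capture times have already been extracted from the images as EXIF DateTimeOriginal strings in the standard `YYYY:MM:DD HH:MM:SS` format. The function names and the sample values are hypothetical and are not taken from the dataset or from the original analysis.

```python
from collections import Counter
from datetime import datetime

# EXIF stores capture times as "YYYY:MM:DD HH:MM:SS".
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def parse_exif_timestamp(value):
    """Convert an EXIF timestamp string into a datetime object."""
    return datetime.strptime(value, EXIF_FORMAT)


def tally_by_weekday_and_hour(timestamps):
    """Count photos per (weekday name, hour of day) bucket."""
    tally = Counter()
    for value in timestamps:
        taken = parse_exif_timestamp(value)
        tally[(WEEKDAYS[taken.weekday()], taken.hour)] += 1
    return tally


# Hypothetical sample values for illustration only (not real dataset timestamps).
sample = [
    "2012:03:20 12:05:11",
    "2012:03:20 12:07:42",
    "2012:03:22 09:15:00",
]
tally = tally_by_weekday_and_hour(sample)
for (weekday, hour), count in sorted(tally.items()):
    print(f"{weekday} {hour:02d}:00  {count}")
```

Bucketing by both weekday and hour is what would surface a concentration such as "Tuesdays around noon"; a weekday-only tally would hide the time-of-day pattern.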

 UCCS photos captured per weekday. Contains information from UCCS: UnConstrained College Students dataset, made available under a modified ODC Attribution License http://www.vast.uccs.edu/UCCS/License.txt

The two research papers associated with the release of the UCCS dataset (Unconstrained Face Detection and Open-Set Face Recognition Challenge and Large Scale Unconstrained Open Set Face Database) acknowledge that the primary funding sources for their work were United States defense and intelligence agencies. Specifically, development of the UnConstrained College Students dataset was funded by the Intelligence Advanced Research Projects Activity (IARPA), the Office of the Director of National Intelligence (ODNI), the Office of Naval Research and the Department of Defense Multidisciplinary University Research Initiative (ONR MURI), and the Special Operations Command and Small Business Innovation Research (SOCOM SBIR), among others. UCCS's VAST site also explicitly states its involvement in the IARPA Janus face recognition project, developed to serve the needs of national intelligence. This establishes that the beneficiaries of this dataset include United States defense and intelligence agencies, though it would go on to benefit other similar organizations.

In 2017, one year after its public release, the UCCS face dataset formed the basis for a defense and intelligence agency funded face recognition challenge at the International Joint Conference on Biometrics (IJCB) in Denver, CO. In 2018, the dataset was used again for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Conference on Computer Vision (ECCV) in Munich, Germany.

As of April 15, 2019, the UCCS dataset is no longer available. But during the time it was publicly available (2016 – 2019, based on publicly available research citations), the UCCS dataset appeared in at least four research papers, including usage by Beihang University, which is known to provide research and development for China's military, and Vision Semantics Ltd, which lists the UK Ministry of Defence as a project partner.

Updates

June 2, 2019: An email exchange with the author, Professor Terrance Boult, clarified that he "did not provide data to any government agency when they collected it, nor does it appear that any US Government agency had ever downloaded it as part of the research competition." The funding was provided to assess state-of-the-art technology in face recognition. 5

However, this type of technology is data-driven and advancements are often derived in part from the dataset, as well as the author's own technical contributions.

Who used UCCS?

The bar chart below presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.

Information Supply Chain

To help understand how UCCS has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the UnConstrained College Students dataset was collected, verified, and geocoded to show how AI training data has proliferated around the world. Click on the markers to reveal research projects at that location.

Citation data is collected using SemanticScholar.org, then dataset usage is verified and geolocated. Citations are used to provide an overview of how and where images were used.

Dataset Citations

The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.

Supplementary Information

To show the types of face images used in the UCCS student dataset while protecting the privacy of the individuals in it, a generative adversarial network was used to interpolate between identities in the dataset. The image below shows the output of a generative adversarial network trained on the UCCS face bounding box areas from 16,000 images and over 90,000 face regions.

 GAN generated approximations of students in the UCCS dataset. © megapixels.cc 2018. Based on the UnConstrained College Students dataset, made available under a modified ODC Attribution License. Rendered using Progressive Growing of GANs for Improved Quality, Stability, and Variation https://github.com/tkarras/progressive_growing_of_gans

UCCS photos taken in 2012

Date Photos
Feb 23, 2012 132
March 6, 2012 288
March 8, 2012 506
March 13, 2012 160
March 20, 2012 1,840
March 22, 2012 445
April 3, 2012 1,639
April 12, 2012 14
April 17, 2012 19
April 24, 2012 63
April 25, 2012 11
April 26, 2012 20

UCCS photos taken in 2013

Date Photos
Jan 28, 2013 1,056
Jan 29, 2013 1,561
Feb 13, 2013 739
Feb 19, 2013 723
Feb 20, 2013 965
Feb 26, 2013 736
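
The per-date counts above can be grouped by weekday to check the Tuesday figure reported earlier. The following Python sketch is illustrative only: the variable and function names are assumptions, and the counts are copied from the two tables.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# (capture date, photo count) pairs copied from the 2012 and 2013 tables.
PHOTOS_PER_DATE = [
    (date(2012, 2, 23), 132),
    (date(2012, 3, 6), 288),
    (date(2012, 3, 8), 506),
    (date(2012, 3, 13), 160),
    (date(2012, 3, 20), 1840),
    (date(2012, 3, 22), 445),
    (date(2012, 4, 3), 1639),
    (date(2012, 4, 12), 14),
    (date(2012, 4, 17), 19),
    (date(2012, 4, 24), 63),
    (date(2012, 4, 25), 11),
    (date(2012, 4, 26), 20),
    (date(2013, 1, 28), 1056),
    (date(2013, 1, 29), 1561),
    (date(2013, 2, 13), 739),
    (date(2013, 2, 19), 723),
    (date(2013, 2, 20), 965),
    (date(2013, 2, 26), 736),
]


def photos_per_weekday(rows):
    """Sum photo counts by weekday name."""
    totals = Counter()
    for day, count in rows:
        totals[WEEKDAYS[day.weekday()]] += count
    return totals


weekday_totals = photos_per_weekday(PHOTOS_PER_DATE)
for name, total in weekday_totals.most_common():
    print(f"{name}: {total:,}")
```

With the counts listed in the tables, Tuesday accounts for 7,029 of the 10,917 photos listed, which is consistent with the "over 7,000" figure, and no capture date falls on a Friday, Saturday, or Sunday.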

Location

The location of the camera and subjects was confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the direction of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the cars passing in the background, and the far wall of the parking garage. The original papers also provide another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image matches the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps

 3D view showing the angle of view of the surveillance camera used for UCCS dataset. Image: Google Maps

Funding

The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations:

Opting Out

If you attended University of Colorado Colorado Springs and were captured by the long-range surveillance camera used to create this dataset, there is unfortunately no way to be removed at present. The authors do not provide any option for students to opt out, nor were students informed that their images would be used for training face recognition. According to the authors, the lack of any consent or knowledge of participation is part of what gives the UnConstrained College Students dataset its value.

Ethics

Credits

This analysis contains information from UCCS: UnConstrained College Students dataset, made available under a modified ODC Attribution License http://www.vast.uccs.edu/UCCS/License.txt

Cite Our Work

If you find this analysis helpful, please cite our work:

@online{megapixels,
  author = {Harvey, Adam and LaPlace, Jules},
  title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
  year = 2019,
  url = {https://megapixels.cc/},
  urldate = {2019-04-18}
}

References

  • 1 "2nd Unconstrained Face Detection and Open Set Recognition Challenge." https://vast.uccs.edu/Opensetface/. Accessed April 15, 2019.
  • 2 Sapkota, Archana and Boult, Terrance. "Large Scale Unconstrained Open Set Face Database." 2013.
  • 3 Günther, M. et al. "Unconstrained Face Detection and Open-Set Face Recognition Challenge." 2018. arXiv 1708.02337v3.
  • 4 Cheng et al. "Surveillance Face Recognition Challenge." 2018. https://arxiv.org/abs/1804.09691
  • 5 Email exchange on June 2, 2019.