JAXA Earth Observation Data

Creators: Japan Aerospace Exploration Agency (JAXA)
Publication Date: 05/2006

The Japan Aerospace Exploration Agency (JAXA) provides access to satellite data from its Earth observation missions, including high-resolution imagery and various environmental parameters. The dataset encompasses over 1,000 TB of archived satellite data, making it one of the most comprehensive collections of Earth observation data. JAXA operates several Earth observation satellites that provide global environmental and geospatial data. The dataset includes data from multiple missions, such as ALOS-2 (land observation), GOSAT (greenhouse gases), GCOM-W (water cycle), GCOM-C (climate monitoring), and Himawari (weather observations). It covers several decades of observations, with some datasets extending back to the 1980s.

Key features include:

  • Atmospheric Monitoring: Tracks greenhouse gas levels (CO₂, CH₄), aerosols, water vapor, and cloud properties.
  • Land Observations: Provides land surface temperature, vegetation indices (NDVI, EVI), urban growth mapping, and disaster response data.
  • Oceanic Measurements: Covers sea surface temperature, ocean color, chlorophyll concentration, and sea ice extent.
  • Climate Change Indicators: Tracks long-term changes in Earth’s environment.
  • Disaster Response & Early Warning: Used for real-time monitoring of typhoons, wildfires, and earthquakes.

The total number of observations varies by mission and parameter. Given that JAXA satellites capture data daily, weekly, or monthly, the dataset consists of millions of individual observations spanning decades. For instance:

  • GOSAT greenhouse gas monitoring: Over 56,000 global observation points.
  • ALOS-2 terrain and land cover changes: Millions of SAR images worldwide.
  • GCOM-W & GCOM-C climate data: Continuous Earth monitoring at global scale.

NASA Earth Observation (NEO)

Creators: NASA (National Aeronautics and Space Administration)
Publication Date: NA

NASA Earth Observation (NEO) is a treasure trove of satellite imagery and remote sensing data. NEO offers daily, weekly, and monthly images captured by various NASA satellites, such as Terra and Aqua. Users can access a wealth of information on climate change, natural disasters, and global ecosystems. This resource is designed to support research, education, and policy-making by delivering accessible and up-to-date Earth observation information. NEO offers data across multiple categories, including:

  • Atmosphere: Parameters such as aerosol optical thickness, carbon monoxide levels, cloud properties, nitrogen dioxide concentrations, ozone levels, rainfall, and water vapor content.
  • Energy: Data on albedo, land surface temperature (day and night), net radiation, outgoing longwave radiation, reflected shortwave radiation, sea surface temperature, and solar insolation.
  • Land: Information on active fires, land cover classification, leaf area index, snow cover, and vegetation index (NDVI).
  • Life: Metrics such as net primary productivity.
  • Ocean: Sea surface temperature data.
  • User-Friendly Access: The NEO platform provides an intuitive interface, allowing users to visualize, download, and analyze data without requiring specialized software or expertise.

 

Instagram Posts from Football Players

Creators: Klostermann, Jan
Publication Date: 2023

This dataset includes information on 334,071 Instagram posts from 1,435 male professional football players who were under contract at any of the 56 clubs in the English Premier League, the Spanish La Liga, and the German Bundesliga. The data was collected on December 31, 2019, and includes the whole history of Instagram posts up to that point in time.

The dataset provides the following information:

  • Player information: Information on each of the football players in the dataset was collected from http://www.transfermarkt.de and includes club, position, market value (at the time of collecting the data), highest market value, and the year in which the highest market value was observed. Further, the Instagram account name is provided.
  • Instagram post information: Information on the Instagram posts including the shortcode (which can be used to open the post on instagram.com), date, caption text, number of likes, number of comments, post type (image, sidecar, video).
  • Instagram post images: For each post, we analyzed the content of the image (first image for sidecar posts, first frame for video posts) using Google Vision and extracted the number of persons, their age, and their gender. Further, we extracted all tags that are included in the image, such as “soccer” or “car”.
  • Additional information: Additional information such as the images of the posts can be requested from the authors.
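
As a small illustration of the shortcode field, the sketch below builds a post link from a shortcode. It assumes Instagram's common permalink pattern `https://www.instagram.com/p/<shortcode>/`, which the dataset description does not spell out; the function name and the sample shortcode are hypothetical.

```python
from urllib.parse import quote

# Assumed permalink pattern; it is not stated in the dataset description.
INSTAGRAM_POST_URL = "https://www.instagram.com/p/{shortcode}/"


def post_url(shortcode: str) -> str:
    """Return the web address of a post for a given shortcode."""
    cleaned = shortcode.strip()
    if not cleaned:
        raise ValueError("shortcode must not be empty")
    # Percent-encode defensively; typical shortcodes only use URL-safe characters.
    return INSTAGRAM_POST_URL.format(shortcode=quote(cleaned, safe=""))


if __name__ == "__main__":
    # "B6abcDEFghi" is a made-up shortcode used for illustration only.
    print(post_url("B6abcDEFghi"))
```

Posts may have been deleted or made private since the collection date, so a link built this way is not guaranteed to resolve.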

The dataset has been used in the following paper:

Klostermann, J., Meißner, M., Max, A., & Decker, R. (2023). Presentation of celebrities’ private life through visual social media. Journal of Business Research, 156, 113524.

Please cite the paper when using the dataset for your own research. It is recommended to read the paper for further information on the dataset.

The Economist Historical Advertisements - Master Dataset

Creators: Kluge, Stefan; Gehrmann, Leonie; Stahl, Florian
Publication Date: 2023

This dataset contains metadata of 512,599 historical advertisements from all 8,840 issues of The Economist magazine, years 1843 to 2014. It is part of a series of datasets related to The Economist Historical Archive (https://www.gale.com/intl/c/the-economist-historical-archive). You will need this Master Dataset if you want to work with any of the related datasets. Each advertisement entry includes various metadata fields such as publication date, issue number, page number, and advertisement dimensions. This structured information enables detailed analyses of trends and patterns in advertising practices over time. In total, the dataset has a size of 195.4 MB.

MLW Zettelmaterial

Creators: Bayerische Akademie der Wissenschaften (BAdW)
Publication Date: 2023

General information:

The data set comprises a total of 114,653 images (18.9 GB), corresponding to 3,507 distinct lemmas.
All images are in RGB but not uniform in size, i.e., height and width differ from image to image.
Additionally, the information on the corresponding lemma is available for each image in a separate JSON file.

Structure:

Most record cards follow the same structure, which is composed of three main parts.

  • The first one (1), and the one deemed most challenging, is the lemma, which is always located in the upper left corner of the record card.
  • The second part (2) is the index of the text where the lemma is found.
  • The third part (3) contains a text extract in which the word (corresponding to the lemma) occurs in context.

Character inventory:

There are 17 different first characters in total: eight letters that each occur in both upper- and lowercase, plus one special character.
The capitalization of a word plays a crucial role since a word’s meaning changes depending on capitalization.
Since the majority of our data stems from the S-series of the dictionary, most lemmas start with the letter “s”.
Likewise, a larger number of lemmas start with “m”, “v”, “t”, “u”, “l”, and “n”.

Occurrence frequencies:

  • A total of 2,420 lemmas (69%) appear on ten record cards or fewer
  • 854 lemmas (24.4%) are present on between 11 and 100 record cards
  • 233 lemmas (6.6%) can be found on more than 100 record cards
  • 1,123 lemmas (approximately 36.7%) appear on only one record card
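
The grouping above can be reproduced from a per-lemma card count. The following is a minimal sketch, assuming the counts are already available as a mapping from lemma to number of record cards; the function name, the bucket labels, and the toy lemmas are hypothetical and not part of the dataset.

```python
def bucket_lemma_frequencies(cards_per_lemma: dict[str, int]) -> tuple[dict[str, int], int]:
    """Group lemmas by how many record cards they appear on.

    Returns the bucket sizes (1-10, 11-100, more than 100 cards) and,
    separately, the number of lemmas that appear on exactly one card.
    """
    buckets = {"1-10": 0, "11-100": 0, ">100": 0}
    singletons = 0
    for lemma, n_cards in cards_per_lemma.items():
        if n_cards < 1:
            raise ValueError(f"lemma {lemma!r} has no record cards")
        if n_cards <= 10:
            buckets["1-10"] += 1
        elif n_cards <= 100:
            buckets["11-100"] += 1
        else:
            buckets[">100"] += 1
        if n_cards == 1:
            singletons += 1
    return buckets, singletons


if __name__ == "__main__":
    # Toy counts for illustration only; these are not values from the dataset.
    toy_counts = {"sal": 1, "sanctus": 10, "manus": 11, "terra": 100, "unus": 250}
    buckets, singletons = bucket_lemma_frequencies(toy_counts)
    total = len(toy_counts)
    for label, size in buckets.items():
        print(f"{label}: {size} lemmas ({size / total:.1%})")
    print(f"exactly one card: {singletons} lemmas")
```

The single-card lemmas are counted separately because they are a subset of the first bucket rather than a fourth group.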

Lengths:

  • Lemma lengths range from one character up to a maximum of 19 characters.
  • The average length of the lemmas lies between five and six characters.

Availability:

Research activity:

  • Koch, P., Nuñez, G. V., Arias, E. G., Heumann, C., Schöffel, M., Häberlin, A., & Aßenmacher, M. (2023). A tailored Handwritten-Text-Recognition System for Medieval Latin. arXiv preprint arXiv:2308.09368.

The Economist Historical Advertisements - Faces Dataset

Creators: Kluge, Stefan
Publication Date: 2023

This dataset contains 116,746 identified faces (bounding box location on image, predicted age and gender) for all historical advertisements from all 8,840 issues of The Economist magazine, years 1843 to 2014. Faces have been detected using the following library: https://pythonrepo.com/repo/timesler-facenet-pytorch-python-deep-learning. You will also need The Economist Historical Advertisements - Master Dataset to work with the data. In total, the dataset has a size of 20.2 MB and is organized as follows:

  • Filename: A unique identifier corresponding to each advertisement where a face has been detected. This identifier links directly to the specific advertisement within The Economist archives.
  • Bounding Box Coordinates:
    • Bounding Box relative X1 and Y1: These values represent the top-left corner coordinates of the bounding box encapsulating the detected face, expressed as proportions relative to the image dimensions.
    • Bounding Box relative X2 and Y2: These values denote the bottom-right corner coordinates of the bounding box, also as relative proportions. To determine the absolute pixel coordinates, multiply these relative values by the image’s width and height, respectively.
  • Segmentation Confidence Score: A numerical value indicating the confidence level of the neural network algorithm that the identified bounding box indeed contains a face. Higher scores reflect greater confidence in accurate face detection.
  • Size Relative: A metric indicating the proportion of the advertisement occupied by the detected face. For example, a value of 1 signifies that the face covers the entire advertisement, while 0.5 indicates it covers half.
  • Predicted Age: An estimated age of the individual based on facial analysis performed by the detection algorithm.
  • Gender Probability: A probability score representing the likelihood of the detected face being female. A value of 0 indicates male, 1 indicates female, and intermediate values (e.g., 0.4) suggest a 40% likelihood of the individual being female.
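
The conversion from relative to absolute coordinates described for the bounding box fields can be sketched as follows. This is an illustration under stated assumptions: the field names in `FaceRecord`, the sample values, and the 0.5 cut-off for turning the gender probability into a label are hypothetical and are not taken from the dataset files.

```python
from dataclasses import dataclass


@dataclass
class FaceRecord:
    # Hypothetical field names; the actual column names may differ.
    x1: float  # relative left edge (0..1)
    y1: float  # relative top edge (0..1)
    x2: float  # relative right edge (0..1)
    y2: float  # relative bottom edge (0..1)
    gender_probability: float  # 0 = male, 1 = female


def to_pixel_box(face: FaceRecord, image_width: int, image_height: int) -> tuple[int, int, int, int]:
    """Scale relative corners to pixels: x by image width, y by image height."""
    left = round(face.x1 * image_width)
    top = round(face.y1 * image_height)
    right = round(face.x2 * image_width)
    bottom = round(face.y2 * image_height)
    return left, top, right, bottom


def gender_label(face: FaceRecord, threshold: float = 0.5) -> str:
    """Turn the probability into a label; the threshold is an assumed choice."""
    return "female" if face.gender_probability >= threshold else "male"


if __name__ == "__main__":
    # Made-up record and image size for illustration only.
    face = FaceRecord(x1=0.25, y1=0.5, x2=0.75, y2=1.0, gender_probability=0.4)
    print(to_pixel_box(face, image_width=800, image_height=600))
    print(gender_label(face))
```

The image width and height have to come from the advertisement image itself, so in practice this step is applied after loading the image that the Filename field points to.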

The Economist Historical Advertisements - Objects Dataset

Creators: Kluge, Stefan
Publication Date: 2023

This dataset contains 191,994 identified object locations and classes for all historical advertisements from all 8,840 issues of The Economist magazine, years 1843 to 2014. We used a state-of-the-art classifier to detect the objects: https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1. You will also need The Economist Historical Advertisements - Master Dataset to work with the data. The dataset has a size of 29.8 MB.

Creators: Kluge, Stefan
This dataset is a specialized collection of metadata on 92,592 historical advertisements from the banking industry, extracted from all 8,840 issues of The Economist magazine, years 1843 to 2014. It is part of a series of datasets related to The Economist Historical Archive (https://www.gale.com/intl/c/the-economist-historical-archive). Each advertisement entry includes various metadata fields such as publication date, issue number, page number, and advertisement dimensions. This structured information enables detailed analyses of trends and patterns within the banking industry’s advertising practices. In total, the dataset has a size of 136.0 MB.
