Stock return prediction with tweets

Creators: Madhyastha, Pranava; Sowinska, Karolina
Publication Date: 2020

This dataset is designed to analyze the impact of Twitter-based textual information on stock returns. Compiled by researchers Karolina Sowinska and Pranava Madhyastha, this dataset was published in 2020 and is made available under the GNU General Public License v3.0 or later. It provides valuable data for financial analytics and natural language processing, particularly in studying the relationship between social media sentiment and stock market performance. By linking tweets to stock return data, the dataset enables the development of predictive models for stock movement based on public sentiment.

The dataset comprises 862,231 labeled tweets, all in English, each associated with specific companies. These tweets serve as samples for analyzing public opinion and sentiment regarding different stocks and financial events. A cleaned subset of 85,176 labeled instances is also included, making the dataset suitable for both large-scale machine learning models and more focused analyses.

Each tweet is linked to corresponding stock return data, allowing for a company-level examination of how Twitter sentiment impacts one-day, two-day, three-day, and seven-day stock returns. This structured linkage between tweets and financial performance provides a unique opportunity to study the effects of social media on stock price fluctuations.

The dataset is approximately 225 MB in size on GitHub, making it manageable for various analytical tasks, including sentiment analysis, text-based predictive modeling, and financial forecasting. It is structured into two primary components:

  • Tweet Data: This includes the textual content of tweets, user metadata, timestamps, and the companies referenced in each tweet. These features allow researchers to perform sentiment analysis, track user engagement, and examine the frequency of stock-related discussions on social media.

  • Stock Return Data: This includes numerical stock return values corresponding to the companies mentioned in the tweets. The returns are recorded over multiple time intervals, enabling the study of both short-term and long-term price movements in response to social media discussions.
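As a rough illustration of how a tweet could be paired with returns over the one-, two-, three-, and seven-day horizons described above, the following minimal sketch computes simple forward returns from a list of closing prices. The function name, the price values, and the choice of simple returns counted in trading days are all assumptions made for illustration; the dataset's own files may define and store returns differently.

```python
from typing import Dict, Optional, Sequence

# Horizons named in the dataset description (in days).
HORIZONS = (1, 2, 3, 7)


def forward_returns(
    closes: Sequence[float],
    day: int,
    horizons: Sequence[int] = HORIZONS,
) -> Dict[int, Optional[float]]:
    """Simple return from `day` to `day + h` for each horizon h.

    Returns None for a horizon that runs past the end of the series.
    This is a hypothetical helper, not part of the dataset's code.
    """
    out: Dict[int, Optional[float]] = {}
    for h in horizons:
        if day + h < len(closes):
            out[h] = closes[day + h] / closes[day] - 1.0
        else:
            out[h] = None
    return out


# Made-up closing prices for one company, one entry per trading day.
closes = [100.0, 102.0, 101.0, 105.0, 104.0, 106.0, 108.0, 110.0]

# Labels that a tweet posted on trading day 0 would receive under these assumptions.
labels = forward_returns(closes, day=0)
```

Under this scheme, every tweet about a company on a given day shares the same set of return labels, which matches the company-level linkage the description mentions.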

Future of Business - Survey Results

Creators: Facebook; OECD; World Bank
Publication Date: 2018

The Future of Business survey is a collaboration between Facebook, the OECD and the World Bank to provide timely insights on the perceptions, challenges, and outlook of online Small and Medium Enterprises (SMEs). The Future of Business survey was first launched as a monthly survey in 17 countries in February 2016 and expanded to 42 countries in 2018. In 2019, the Future of Business survey increased coverage to 97 countries and moved to a bi-annual cadence.

The target population consists of SMEs that have an active Facebook business Page and includes both newer and longer-standing businesses, spanning a variety of sectors. To date, more than 90 million SMEs have created a Facebook Page, and more than 700,000 of these Facebook Page owners have taken the survey. With more businesses leveraging online tools each day, the survey provides a lens into a new mobilized, digital economy and, in particular, insights on the actors: a relatively unmeasured community worthy of deeper consideration and considerable policy interest. The dataset is approximately 0.04 GB in size.

The survey includes questions about perceptions of current and future economic activity, challenges, business characteristics and strategy. Custom modules include questions related to regulation, access to finance, digital payments, and digital skills.

Relative Wealth Index Data

Creators: Chi, Guanghua; Fang, Han; Chatterjee, Sourav; Blumenstock, Joshua E.
Publication Date: 2021
The Relative Wealth Index predicts the relative standard of living within countries using de-identified connectivity data, satellite imagery and other nontraditional data sources.
It has been built by researchers at the University of California, Berkeley and Facebook. The estimates are built by applying machine learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, topographic maps, as well as aggregated and de-identified connectivity data from Facebook. The researchers train and calibrate the estimates using nationally representative household survey data from 56 LMICs, then validate their accuracy using four independent sources of household survey data from 18 countries. They also provide confidence intervals for each micro-estimate to facilitate responsible downstream use.
The data is provided for 93 low- and middle-income countries at 2.4 km resolution. It covers the time between April 01, 2021 and December 22, 2023. An interactive map of the Relative Wealth Index is available here: http://beta.povertymaps.net/
The combined size of the dataset is approximately 0.08 GB, available in CSV format.
Please cite / attribute any use of this dataset using the following:
Microestimates of wealth for all low- and middle-income countries. Guanghua Chi, Han Fang, Sourav Chatterjee, Joshua E. Blumenstock. Proceedings of the National Academy of Sciences, Jan 2022, 119 (3) e2113658119; DOI: 10.1073/pnas.2113658119
