Category Archives: Data analysis and reports

Large-scale data exploration with R

“You are provided a selection of norms for English and German across a range of variables. The norms rely on human judgements and/or semi-automatic extensions regarding degrees of concreteness, valence, arousal, imageability and further perception modalities. In addition, you are provided corpus-based frequency lists as well as distributional co-occurrence scores.
The goal of your project is to first analyse a subset of the norm data and then to explore whether judgements are related across modalities and to corpus-based frequency and semantic diversity.

Task: Write a report about your findings. Your report should be 5–8 pages long, excluding the bibliography.”

(Unfortunately, the corpus frequency and distributional information files are too large to be uploaded here. I hope there is another way to provide the files if someone takes on this assignment. It is a beginner's R course, so the report should contain only basic plots and statistics; you can still choose the variables freely. Basically, have some fun looking into the data!)

Digital Analytics – Individual Research Report

What you need to do:
Part I Data collection
Write a Python script that harvests the tweets of the three Twitter accounts the study focuses on. Get the contents of their tweets and the time each was tweeted. Write the information into a data file (an xlsx file, not a CSV file; see the documentation below). You will need to upload both the Python script and the data file that you generated as part of the assignment.
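As a rough illustration of Part I, here is a minimal sketch of how harvested tweets could be shaped into rows and written to an xlsx file. Several things are assumptions, not part of the assignment: the account names are placeholders, `fetch_tweets` is a hypothetical callable standing in for whatever API call the course documentation prescribes, each tweet is assumed to arrive as a dict with `text` and `created_at` keys, and the third-party `openpyxl` package is assumed to be installed for the writing step.

```python
from datetime import datetime, timezone

# Placeholder account names: the assignment does not list the three accounts.
ACCOUNTS = ["account_a", "account_b", "account_c"]


def to_row(account, tweet):
    """Turn one fetched tweet (dict with 'text' and 'created_at') into a row."""
    created = tweet["created_at"]
    # Excel cells cannot hold timezone-aware datetimes, so convert to naive UTC.
    if created.tzinfo is not None:
        created = created.astimezone(timezone.utc).replace(tzinfo=None)
    return [account, created, tweet["text"]]


def harvest(accounts, fetch_tweets):
    """Collect rows for every account.

    `fetch_tweets` is a hypothetical callable: given an account name, it
    returns an iterable of tweet dicts. Replace it with the real API call.
    """
    rows = []
    for account in accounts:
        for tweet in fetch_tweets(account):
            rows.append(to_row(account, tweet))
    return rows


def write_xlsx(rows, path):
    """Write a header plus one row per tweet to an xlsx file (needs openpyxl)."""
    from openpyxl import Workbook  # third-party package, assumed installed

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["account", "created_at", "text"])
    for row in rows:
        sheet.append(row)
    workbook.save(path)
```

A typical run would be `write_xlsx(harvest(ACCOUNTS, fetch_tweets), "harvested_tweets.xlsx")`, where the file name is again only an example. Keeping the API call behind a single function makes it easy to swap in the method shown in the course documentation.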

Part II Data analysis
Analyse the file that is provided, named twitterdata.xlsx. The analysis will be a descriptive analysis of the tweets, in which you will compare how the three accounts in focus have tweeted (and how that possibly changed over time). This will require drawing a random sample of 50 COVID tweets per account (see the additional documentation on how to do that).
Be creative in how you handle the analysis. You will have to upload the Python script in which you perform the analysis (data cleaning if necessary and analysis/visualisation).
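The sampling step described above could look roughly like the sketch below. It is not the procedure from the additional documentation, which is not reproduced here. It assumes the tweets are already loaded as `(account, created_at, text)` tuples and that a COVID tweet can be recognised by a simple keyword match; the keyword list and the function names are illustrative only.

```python
import random
from collections import defaultdict

# Assumed keyword list: the assignment does not define what counts as a COVID tweet.
COVID_KEYWORDS = ("covid", "corona", "pandemic")


def is_covid_tweet(text):
    """Return True if the tweet text contains any of the assumed keywords."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in COVID_KEYWORDS)


def sample_per_account(rows, n=50, seed=42):
    """Draw up to `n` random COVID tweets for each account.

    `rows` is an iterable of (account, created_at, text) tuples.
    A fixed seed makes the sample reproducible for the report.
    """
    by_account = defaultdict(list)
    for account, created_at, text in rows:
        if is_covid_tweet(text):
            by_account[account].append((account, created_at, text))

    rng = random.Random(seed)
    sample = {}
    for account in sorted(by_account):
        tweets = by_account[account]
        # If an account has fewer than n COVID tweets, keep them all.
        sample[account] = rng.sample(tweets, min(n, len(tweets)))
    return sample
```

Fixing the seed means the same 50 tweets per account are drawn on every run, so the figures in the results section can be regenerated exactly.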

Part III Report
Write a research report (1,500 to 2,000 words, all included), with the following sections:
1. Introduction section in which you contextualize the research and outline the research questions.
2. Methodology section in which you concisely explain the procedure: (1) how you got the data (although you will work with the data that I provide, the procedure should be the same as the one you used; only the timeframe of data collection is wider, starting December 1st and lasting until April 18th), and (2) what the sample data look like (i.e., how many tweets, harvested in what period; this will be a description of the file that is available on Blackboard, not the data file you harvested yourself).
3. Results section in which you discuss the analysis: i.e., what did you do with what data, and what does that tell us.
4. Discussion section in which you explain how the results answer the initial research questions (i.e., what do the results mean). This is concluded by a reflection on the strengths and weaknesses of the research methodology (draw inspiration from the introduction lecture, as well as from the module on APIs).
Make sure that the report mentions your name and student number. There are no strict guidelines on how to format the document, except for the word count. However, make it look clean and professional in every possible way. A professionally typeset research article by a publisher such as Sage, Wiley-Blackwell or Elsevier might inspire you.

In total, there are five files you need to upload, combined in a single compressed .zip file:

1. A Python file that harvests tweets and writes them into a data file (.py file)
2. The data file with the harvested tweets (.xlsx file)
3. A Python file with the data processing/analysis/visualisation (.py file)
4. The final version of the data file that you processed
5. A text document with the 1,500 to 2,000-word research report (.pdf)
Your project makes up 50% of your final grade.

What are you graded on?
1. Were you able to outline the relevance of the research question? (introduction section report)
2. Is the code that you wrote to harvest tweets valid and effective? (harvest file)
3. Were you able to clearly describe the procedure on how tweets were harvested? (methodology section report)
4. Were you able to transparently explain what you did with the data, what you analysed/visualised? (results section report)
5. Were you able to clean and format the given research data? (analysis file)
6. Is the analysis/visualisation that you performed sound/valid? (analysis file)
7. Does your discussion of the results make sense in answering the research questions? Are you able to pinpoint the strengths and weaknesses of the method (including whether analysing tweets is the right way to go…)? (discussion section report)
8. Is your writing tidy and clear? (entire report)
9. Is your document professionally formatted? (entire report)

Office Supply Store Data Analysis

An office supply store tests a telemarketing campaign to its existing business customers. The company targeted approximately 16,000 customers for the campaign. Assume you are a consultant brought on board to help the company use the findings from the tests to its advantage. Refer to the accompanying spreadsheet, which contains the results of the tests.
The detailed requirements and expected deliverables are mentioned in Capstone Assignment.docx.
Three sample presentations are attached for reference. The data to be used are in the Excel file.