Brian Newsom Week 7 #9

Open · wants to merge 2 commits into master
41 changes: 23 additions & 18 deletions README.md
@@ -1,70 +1,75 @@
# Name

Brian Newsom

# How many points have you earned?

100/100

(Make your own calculation and replace the number 0 with the points you think you've earned.)

# How many hours have you spent on this?

3-4

# What is the most difficult part about this week's challenge?

Figuring out a methodology to discern the effect of the climate march, and then enacting it in Tableau. I still struggle with Tableau quite a bit; it is definitely non-trivial.

# Show and tell (10 points)

## Link (2 points)

[Ushahidi: Machine Learning for Human Rights](http://dssg.io/2013/07/15/ushahidi-machine-learning-for-human-rights.html)

## Explain why you found the project interesting. (8 points)

This project involves machine learning and text analysis of text messages, tweets, and the like. It is very easy for a human to understand a poorly written text, but teaching a computer to do the same is non-trivial. I thought it was really interesting to hear that people were previously hand-tagging each entry; the sheer amount of manpower required makes it surprising that it was worth it just to gather information and respond quickly to situations. The article provides interesting insight into how the problem was shaped, narrowing it down to detecting a language, guessing a category, extracting locations, and flagging duplicates. Within that limited scope the project seemed to be quite successful, which is awesome.

# GDELT (I) (5 points x 6 + 20 points x 3 = 90 points)

## Checkpoints

### 1 (5 points)

![image](cp1.png?raw=true)

### 2 What types of questions do you think this database could provide insight into? (5 points)


This is a really interesting and massive dataset! From what I can discern from the documentation, one of the most interesting insights it could provide is how events affect each other. For example, how an Al Qaeda leader announcing something may affect the President's contributions or the number of events coming from political leaders. There is a ton of data, and it could be applied in countless ways to learn interesting things. The dataset seems to be primarily political and social in nature, which covers so many areas of knowledge.

### 3 (5 points)

![image](cp3.png?raw=true)

### 4 (5 points)

![image](cp4.png?raw=true)

### 5a (5 points)

![image](cp5.png?raw=true)

### 5b Do you have any thoughts on why these events are missing geographic information? (5 points)

The top few event codes are as follows:
```
010 Make statement, not specified below
020 Appeal, not specified below
042 Make a visit
040 Consult, not specified below
```
What we see is a pattern of vagueness, which is consistent with null locations. Since there is not even information about what the statement or appeal concerned, it is likely that the location was unknown or was simply never added because there was not enough information. It is worth noting, however, that these may just be the most common occurrences overall precisely because of their vagueness.
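
As a rough sketch of one way to produce this tally with Node (in the same style as concat.js below), the script could count event codes among rows that have no geographic coordinates. It assumes the concatenated file built by concat.js, column names such as `EventCode` and `ActionGeo_Lat` taken from GDELT's documented header, and tab-delimited rows; these are assumptions about my local setup, not part of the graded Tableau work.

```js
// Sketch only: count event codes among rows missing geographic coordinates.
// Assumes the concatenated file built by concat.js (header row + daily exports)
// and GDELT's documented, tab-delimited column layout.
var fs = require('fs');

var lines = fs.readFileSync('./relevantData/headers.txt', 'utf8').split('\n');
var header = lines[0].split('\t');
var codeIdx = header.indexOf('EventCode');
var latIdx = header.indexOf('ActionGeo_Lat');

var counts = {};
lines.slice(1).forEach(function (line) {
  var cols = line.split('\t');
  if (cols.length <= latIdx) { return; }     // skip blank or malformed lines
  if (cols[latIdx] === '') {                 // no geographic information
    counts[cols[codeIdx]] = (counts[cols[codeIdx]] || 0) + 1;
  }
});

// Print the five most common codes among un-geolocated events
Object.keys(counts)
  .sort(function (a, b) { return counts[b] - counts[a]; })
  .slice(0, 5)
  .forEach(function (code) { console.log(code, counts[code]); });
```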

## Challenges

### 1 (20 points)
Proving impact is a difficult task, especially because correlation does not imply causation. Since I am not yet very familiar with the dataset, I believe the best way to show that the climate summit made an impact is to examine the response of environmentally concerned political leaders around the world following the march, and to try to discover whether they committed to changing policies as a result. With GDELT data, this could be done by comparing the data from before the climate summit with the days following it, and looking at political leaders' responses to see if there is a significant change in awareness or response (on the Goldstein scale) that can be attributed to the march.
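
As a rough sketch (not the Tableau workflow itself), this comparison could also be scripted in Node against the concatenated file from concat.js: filter to environment-related actors and track the average Goldstein score per day around the summit. The field names (`SQLDATE`, `Actor1Code`, `GoldsteinScale`) follow GDELT's documented header, and using "ENV" in `Actor1Code` as the environmental filter is a simplifying assumption.

```js
// Sketch only: average Goldstein score per day for environment-related actors.
// Assumes the concatenated file from concat.js and GDELT's documented columns;
// matching "ENV" in Actor1Code is a simplifying assumption for "environmental".
var fs = require('fs');

var lines = fs.readFileSync('./relevantData/headers.txt', 'utf8').split('\n');
var header = lines[0].split('\t');
var dateIdx = header.indexOf('SQLDATE');
var actorIdx = header.indexOf('Actor1Code');
var goldIdx = header.indexOf('GoldsteinScale');

var perDay = {};                              // SQLDATE -> { sum, n }
lines.slice(1).forEach(function (line) {
  var cols = line.split('\t');
  if (cols.length <= goldIdx) { return; }                        // skip short/blank lines
  if ((cols[actorIdx] || '').indexOf('ENV') === -1) { return; }  // environment-related actors only
  var day = cols[dateIdx];
  perDay[day] = perDay[day] || { sum: 0, n: 0 };
  perDay[day].sum += Number(cols[goldIdx]);
  perDay[day].n += 1;
});

Object.keys(perDay).sort().forEach(function (day) {
  console.log(day, (perDay[day].sum / perDay[day].n).toFixed(2), '(' + perDay[day].n + ' events)');
});
```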

### 2 (20 points)

![image](ch2.png?raw=true)

### 3 (20 points)

![image](ch3.png?raw=true)

I plotted the events of actors with environmental codes prior to (red), during (yellow), and after (green) the march, sized by their Goldstein scale impact. My theory is rather hard to confirm from this dataset. There seem to be more positive environmental events occurring after the march, but not significantly so. It appears the march did not immediately push politicians to speak about the environment or try to pass new bills, but this makes sense. I think one issue with my technique was the limited time frame: most things in politics happen slowly, so examining only about five days was probably not sufficient. Also, this experiment depends on GDELT's categorization of events, since every event counted had to be tagged environmental (a tag which may itself be too broad). All said, I think my experiment provides interesting insight, but it is not ultimately conclusive in analyzing the impact.
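
To sanity-check the caveat about how broad the "environmental" tag is, a small Node sketch against the same concatenated file could count how many events actually carry an environment-related actor code in each of the three periods. Again, GDELT's documented column names and an "ENV" actor-code filter are assumptions; the march date (20140921) comes from the data files in this pull request.

```js
// Sketch only: how many events carry an environment-related actor code,
// split into before / during / after the march (2014-09-21).
// Assumes the concatenated file from concat.js and GDELT's documented columns.
var fs = require('fs');

var lines = fs.readFileSync('./relevantData/headers.txt', 'utf8').split('\n');
var header = lines[0].split('\t');
var dateIdx = header.indexOf('SQLDATE');
var actorIdx = header.indexOf('Actor1Code');

var total = 0;
var periods = { before: 0, during: 0, after: 0 };
lines.slice(1).forEach(function (line) {
  var cols = line.split('\t');
  if (cols.length <= actorIdx) { return; }                       // skip short/blank lines
  total += 1;
  if ((cols[actorIdx] || '').indexOf('ENV') === -1) { return; }  // keep environment-tagged events only
  var day = Number(cols[dateIdx]);
  if (day < 20140921) { periods.before += 1; }
  else if (day === 20140921) { periods.during += 1; }
  else { periods.after += 1; }
});

console.log('total events:', total);
console.log('environment-tagged events:', periods);
```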
Binary file added ch2.png
Binary file added ch3.png
37 changes: 37 additions & 0 deletions concat.js
@@ -0,0 +1,37 @@
// Challenge Week 7
// Brian Newsom
// Script to concatenate files using nodejs
var fs = require('fs');

// Read the contents of `toAppend` and append them to `file` (both calls are asynchronous).
function append(file, toAppend){
  fs.readFile(toAppend, function read(err, data) {
    if (err) {
      throw err;
    }
    console.log("File read successful");
    fs.appendFile(file, data, function (err) {
      if (err) {
        throw err;
      }
      console.log('Data appended to file');
    });
  });
}

// Async is okay here because order doesn't matter, we just need the data

var file = './relevantData/headers.txt';
var toAppend = './relevantData/20140921.export.CSV';
append(file,toAppend);

toAppend = './relevantData/20140922.export.CSV';
append(file,toAppend);

toAppend = './relevantData/20140923.export.CSV';
append(file,toAppend);

toAppend = './relevantData/20140924.export.CSV';
append(file,toAppend);

toAppend = './relevantData/20140925.export.CSV';
append(file,toAppend);
Binary file added cp1.png
Binary file added cp3.png
Binary file added cp4.png
Binary file added cp5.png