Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter for users who have profile bio content #308

Closed
ebarry opened this issue Aug 23, 2015 · 33 comments
Closed

filter for users who have profile bio content #308

ebarry opened this issue Aug 23, 2015 · 33 comments
Labels
discussion help wanted requires help by anyone willing to contribute
Milestone

Comments

@ebarry
Copy link
Member

ebarry commented Aug 23, 2015

There should probably be a more comprehensive brainstorm on how to make http://publiclab.org/people something useful for community development. But for now, can admins get a datadump for all usernames where anyone has filled out anything in their profile?

just two columns, username, and text from profile. no images.

@jywarren jywarren changed the title need a datadump or way to browse all profiles where anyone has written anything. filter for users who have profile bio content Aug 24, 2015
@jywarren
Copy link
Member

Thanks -- urgency?

@ebarry
Copy link
Member Author

ebarry commented Aug 24, 2015

not urgent.

@btbonval
Copy link
Member

This will be fantastic for moderators to find spammers we haven't gotten to yet!

I bet there are thousands of such users with spam profiles that aren't yet banned.

@ebarry
Copy link
Member Author

ebarry commented Aug 28, 2015

If this is to be built into the site as a filter for users who have profile bio content, i will also need to be able to search within profile content from the site. That's what i was planning to do with the datadump.

@ebarry ebarry added the outreach issues involve community involvement and helping people who're stuck somewhere label Sep 8, 2015
@ebarry
Copy link
Member Author

ebarry commented Oct 13, 2015

Gently bumping this up. Stevie and I are working on geography this fall (Q4) and as part of that we will be doing targeted geographic outreach to potential organizers. Having this filter will enable us to search for people who mention geographies in their bio, letting us make progress on this in the lagtime before RichProfiles happens.

@btbonval
Copy link
Member

What format do you want this in? CSV will probably not work because profiles can have commas (there are ways around this).

You just want username and profile. Would it suffice to do something like this:

# username

profileprofile

profile

profile

# username

profile profile profile

# username

profile

@ebarry
Copy link
Member Author

ebarry commented Oct 14, 2015

Could you get username, email address, if they've ever posted or commented (or not), and
the full text of what they typed in as their bio?
On Oct 13, 2015 10:07 PM, "Bryan Bonvallet" [email protected]
wrote:

What format do you want this in? CSV will probably not work because
profiles can have commas (there are ways around this).

You just want username and profile. Would it suffice to do something like
this:

username

profileprofile

profile

profile

username

profile profile profile

username

profile


Reply to this email directly or view it on GitHub
#308 (comment).

@btbonval
Copy link
Member

Oh, it looks like I don't have access to the system anymore. I thought I did.

@btbonval
Copy link
Member

woops I have access. Some of my connection configurations are out of date.

@btbonval
Copy link
Member

I think this query should work. I started running it but it appears to take time. Works fine if I restrict it to one or two users specifically.

SELECT users.name, count(node.nid) AS notes, count(comments.cid) AS comments, profile_values.value AS profile
FROM users
INNER JOIN profile_values ON (users.uid = profile_values.uid AND profile_values.fid = 1 AND length(profile_values.value) > 0)
LEFT OUTER JOIN node ON (users.uid = node.uid)
LEFT OUTER JOIN comments ON (users.uid = comments.uid);

That'll get the first profile though, but not later updates. I can't seem to figure out how to get MySQL to get the max fid per user. SELECT uid, max(fid) FROM profile_values GROUP BY uid ignores GROUP BY and simply returns the same max(fid) for every row.

@btbonval
Copy link
Member

neat trick using self outer join to find largest value (newest profile_value in this case)

SELECT pv1.value
FROM profile_values AS pv1
LEFT OUTER JOIN profile_values AS pv2
  ON (pv1.uid = pv2.uid AND pv1.fid < pv2.fid) WHERE pv2.fid IS NULL
ORDER BY pv1.uid;

I forgot to make sure user is not banned, and I screwed up the counts for comments and notes. This looks better, and it seemingly runs faster too.

SELECT users.name, node.notes, comments.comments, profile_values.value AS profile
FROM users
INNER JOIN (
  SELECT pv1.uid, pv1.value
  FROM profile_values AS pv1
  LEFT OUTER JOIN profile_values AS pv2
    ON (pv1.uid = pv2.uid AND pv1.fid < pv2.fid)
  WHERE pv2.fid IS NULL AND length(pv1.value) > 0
) AS profile_values ON (users.uid = profile_values.uid AND users.status > 0)
LEFT OUTER JOIN (
  SELECT uid, count(nid) AS notes
  FROM node
  GROUP BY uid
) AS node ON (users.uid = node.uid)
LEFT OUTER JOIN (
  SELECT uid, count(cid) AS comments
  FROM comments
  GROUP BY uid
) AS comments ON (users.uid = comments.uid);

@btbonval
Copy link
Member

@ebarry okay I ran this and exported. It was too big to share via email so I uploaded to drive and shared on that.

@btbonval btbonval self-assigned this Oct 15, 2015
@btbonval
Copy link
Member

To reiterate my comment earlier:

btbonval commented on Aug 25
This will be fantastic for moderators to find spammers we haven't gotten to yet!

This turns out to be very true.

@ebarry
Copy link
Member Author

ebarry commented Oct 15, 2015

Brian, thank you. This is a treasure trove for place-based organizing.

@btbonval
Copy link
Member

do the thing again, but with more thing!

  1. newer. not the old one, but a new one. like from now. and not the before times.
  2. include user email addresses
  3. possibly include signup date
  4. (possibly include last login. not asked for, but heck)
  5. workaround the spaces.

@btbonval
Copy link
Member

Adds new fields requested but no workarounds for spacing issues.

SELECT users.name, users.mail AS email, users.created AS signup_date, users.access AS last_browse_date, node.notes, comments.comments, profile_values.value AS profile
FROM users
INNER JOIN (
  SELECT pv1.uid, pv1.value
  FROM profile_values AS pv1
  LEFT OUTER JOIN profile_values AS pv2
    ON (pv1.uid = pv2.uid AND pv1.fid < pv2.fid)
  WHERE pv2.fid IS NULL AND length(pv1.value) > 0
) AS profile_values ON (users.uid = profile_values.uid AND users.status > 0)
LEFT OUTER JOIN (
  SELECT uid, count(nid) AS notes
  FROM node
  GROUP BY uid
) AS node ON (users.uid = node.uid)
LEFT OUTER JOIN (
  SELECT uid, count(cid) AS comments
  FROM comments
  GROUP BY uid
) AS comments ON (users.uid = comments.uid);

@btbonval
Copy link
Member

It looks like dates are stored in time since epoch, which is not very useful. In the profie, line feeds and carriage returns need to be replaced with spaces and commas should be replaced with semi-colons or something. This can probably be done as part of the MySQL query.

@btbonval
Copy link
Member

I'm going to skip login times after all. looks like those times are not being updated, so it is probably not worth including. #360

I'll keep signup date though.

@btbonval
Copy link
Member

SELECT users.name, users.mail AS email, FROM_UNIXTIME(users.created) AS signup_date, node.notes, comments.comments,
REPLACE(REPLACE(REPLACE(TRIM(profile_values.value), ",", ";"), "\r", " "), "\n", " ") AS profile
FROM users
INNER JOIN (
  SELECT pv1.uid, pv1.value
  FROM profile_values AS pv1
  LEFT OUTER JOIN profile_values AS pv2
    ON (pv1.uid = pv2.uid AND pv1.fid < pv2.fid)
  WHERE pv2.fid IS NULL AND length(pv1.value) > 0
) AS profile_values ON (users.uid = profile_values.uid AND users.status > 0)
LEFT OUTER JOIN (
  SELECT uid, count(nid) AS notes
  FROM node
  GROUP BY uid
) AS node ON (users.uid = node.uid)
LEFT OUTER JOIN (
  SELECT uid, count(cid) AS comments
  FROM comments
  GROUP BY uid
) AS comments ON (users.uid = comments.uid);

@btbonval
Copy link
Member

oh I guess last time I supplied a tab-separated value sheet so the commas shouldn't have been a problem. There might have been rogue tabs in profiles, but tabs in html editors are far less likely than commas as tab usually changes focus rather than entering a character.

Good thing to keep in mind for next time: TSVs work fine, change the replace above comma-to-semicolon conversion to convert tabs into four spaces and then almost the entire sense is maintained.

@btbonval
Copy link
Member

@ebarry has requested to include wiki edits and comments. I'll have to read the code on how plots2 does this, but I don't think it is something straightforward to do directly in the database.

@ebarry ebarry mentioned this issue Nov 8, 2016
7 tasks
@ebarry ebarry added this to the Spam milestone Feb 21, 2018
@grvsachdeva
Copy link
Member

Hey @ebarry, what goals/issues we can make from here, keeping in view the latest /people page? Thanks!

@grvsachdeva grvsachdeva added help wanted requires help by anyone willing to contribute discussion and removed outreach issues involve community involvement and helping people who're stuck somewhere labels Mar 25, 2019
@ebarry
Copy link
Member Author

ebarry commented Mar 25, 2019 via email

@ebarry
Copy link
Member Author

ebarry commented Mar 25, 2019 via email

@grvsachdeva
Copy link
Member

The essence of this issue is to have profile bio text and tags show up
in Search results.
If this is happening already (?), then we can just close this issue.

We currently show user bio only on /profile page but let's think where else we can add bio field. Here's the search result for warren below 👇

Screenshot from 2019-03-25 23-12-58

If we decide on adding another column for bio in the table, that can make the page a little crowded? What do you think about displaying a popup on hovering over the user image or name? Or, any other idea? That popup can also contain the user profile tags.

@grvsachdeva
Copy link
Member

I suppose there is a second part which is that a Stats function can show
how many people have made which types of contributions. It's a community
awareness / eval function, as well as perhaps a support for spam
moderation. Perhaps we could move over to a Stats issue to describe how
that could work.

Yes @ebarry, that'll be good for the stats page. Should I append this part to stats issue (planning)?

Thanks!

@ebarry
Copy link
Member Author

ebarry commented Mar 25, 2019 via email

@grvsachdeva
Copy link
Member

Hmm, that's interesting!

Currently, we don't search over profile bio, but yes, it would be awesome to see search related profiles. @jywarren should we make improvement in the current API for this or new API?

Great idea Liz!

@jywarren
Copy link
Member

jywarren commented Mar 25, 2019 via email

@grvsachdeva
Copy link
Member

Oh actually yes we do search bio content:
https://github.com/publiclab/plots2/blob/master/app/models/user.rb#L360-L362

😮

@jywarren
Copy link
Member

jywarren commented Mar 25, 2019 via email

@grvsachdeva
Copy link
Member

@jywarren regarding searching user tag. I found this suggestion from GitHub - #3506.

@grvsachdeva
Copy link
Member

I have dropped @ebarry's thought as a comment in Stats planning issue, let take it there #3498 (comment)

And, regarding usertag @jywarren is this - #3506 related?

Thanks and closing this one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion help wanted requires help by anyone willing to contribute
Projects
None yet
Development

No branches or pull requests

4 participants