Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Twitter): urls are now formatted with the std https://twitter.com… #3925

Merged
merged 11 commits into from
Apr 21, 2020
44 changes: 44 additions & 0 deletions migrations/20200409232153_rewrite_twitter_urls.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
<?php

use Phinx\Migration\AbstractMigration;

class RewriteTwitterUrls extends AbstractMigration
{
/**
* Change Method.
*
* Write your reversible migrations using this method.
*
* More information on writing migrations is available here:
* http://docs.phinx.org/en/latest/migrations.html#the-abstractmigration-class
*
* The following commands can be used in this method and Phinx will
* automatically reverse them when rolling back:
*
* createTable
* renameTable
* addColumn
* renameColumn
* addIndex
* addForeignKey
*
* Remember to call "create()" or "update()" and NOT "save()" when working
* with the Table class.
*/
public function up()
{
$sql = "UPDATE messages
Copy link
Contributor Author

@rowasc rowasc Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to do replace rather than set because I wanted to play it safe in case there is some scenario where message doesn't conform with just having a single tweet as the content and has other content that we shouldn't just erase with a SET
Light concern: this change should be fine, but it may be pretty heavy in some deployments... like my ridiculous testing one with like 300k tweeets ...

INNER JOIN contacts on contacts.id = messages.contact_id
SET messages.message = REPLACE(messages.message, concat('https://twitter.com/statuses/', messages.data_source_message_id), concat('https://twitter.com/', contacts.contact, '/status/', messages.data_source_message_id))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this couldn't be done, because when I was checking a database with tweet messages, I couldn't see the tweet user handles anywhere. Turns out that we are storing the numeric ID for the user.

It also turns out that twitter at this point in time now will accept that numeric ID in the tweet URL (i.e. https://twitter.com/:userNumId/status/:tweetId ) and redirect to the URL with the user handle.

So that's good now, it makes me queasy though 🤷 Should we start storing the screen name, should we start storing both? Big questions for a Friday

WHERE `messages`.`type` = 'twitter'";
$this->execute($sql);
}

public function down() {
$sql = "UPDATE messages
INNER JOIN contacts on contacts.id = messages.contact_id
SET messages.message = REPLACE(messages.message, concat('https://twitter.com/', contacts.contact, '/status/', messages.data_source_message_id), concat('https://twitter.com/statuses/', messages.data_source_message_id))
WHERE `messages`.`type` = 'twitter'";
$this->execute($sql);
}
}
11 changes: 9 additions & 2 deletions src/App/DataSource/Twitter/Twitter.php
Original file line number Diff line number Diff line change
Expand Up @@ -187,13 +187,20 @@ public function fetch($limit = false)
) {
continue;
}
$user_id = $user['id_str'];
// @todo Check for similar messages in the database before saving
$messages[] = [
'type' => MessageType::TWITTER,
'contact_type' => Contact::TWITTER,
'from' => $user['id_str'],
'from' => $user_id,
'to' => null,
'message' => 'https://twitter.com/statuses/' . $id,
/**
* Best compromise I could find was just make the proper urls with user_id rather than only
* tweet id (for which there is an unofficial formula)... since there doesn't seem to be a way to grab the URL from the
* API itself in the v1.1 search endpoint.
* Fun fact: if the user id is wrong, twitter still takes you to the correct Tweet... they just use the tweet id
**/
'message' => "https://twitter.com/$user_id/status/$id",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't find a way to get twitter urls... so I'm using this more obvious user facing URL instead of /statuses... which I suppose gives me more peace of mind... but still, would prefer if twitter actually gave us the url to use, you know?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd so much love that too!

'title' => 'From twitter on ' . $date,
'datetime' => $date,
'data_source_message_id' => $id,
Expand Down
6 changes: 3 additions & 3 deletions tests/unit/App/DataSource/TwitterDataSourceTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,7 @@ function ($a, $b, $c, $d) use ($mockTwitterOAuth) {
'type' => 'twitter',
'contact_type' => 'twitter',
'from' => '1112222223', // twitter user id
'message' => 'https://twitter.com/statuses/abc123',
'message' => 'https://twitter.com/1112222223/status/abc123',
'to' => null,
'title' => 'From twitter on Thu Apr 06 15:24:15 +0000 2017',
'data_source_message_id' => 'abc123',
Expand All @@ -282,7 +282,7 @@ function ($a, $b, $c, $d) use ($mockTwitterOAuth) {
'type' => 'twitter',
'contact_type' => 'twitter',
'from' => '1112222225', // twitter user id
'message' => 'https://twitter.com/statuses/abc125',
'message' => 'https://twitter.com/1112222225/status/abc125',
'to' => null,
'title' => 'From twitter on Thu Apr 06 15:24:15 +0000 2017',
'data_source_message_id' => 'abc125',
Expand All @@ -293,7 +293,7 @@ function ($a, $b, $c, $d) use ($mockTwitterOAuth) {
'type' => 'twitter',
'contact_type' => 'twitter',
'from' => '12344494949', //twitter user id
'message' => 'https://twitter.com/statuses/abc126',
'message' => 'https://twitter.com/12344494949/status/abc126',
'to' => null,
'title' => 'From twitter on Thu Apr 06 15:24:15 +0000 2017',
'data_source_message_id' => 'abc126',
Expand Down