Import posts from Jetpack/WordPress.com

I’ve just released version 1.8 of both Keyring and Keyring Social Importers. This version includes a new service file and an accompanying importer, which let you import content from a Jetpack-powered WordPress site using the WordPress.com REST API. That covers any site hosted on WordPress.com, and any self-hosted site with the Jetpack plugin installed. There are also a few key fixes for the Twitter and LinkedIn services/importers, so it’s a nice update.

The new importer will pull across the entire content of posts, including tags. Similar to the Instapaper importer, it attempts to avoid duplicate content issues by marking pages as noindex if they come from imported content.

This is another piece of the puzzle required for me to create a complete archive of my digital footprints over on Dented Reality, now that I’m blogging here. This post should be imported over there automatically within an hour.

Note that currently the importer doesn’t sideload any media items (will add that soon) or support geo data (again, I’ll add that when I get a chance).

Check it out, and please use responsibly!

Recent Social Importer Updates

I’ve been trying to make small improvements to the Keyring Social Importers package (and People & Places) that I maintain, and have made a number of them over the last few weeks. Here are some details of recent updates which you may have missed:

People & Places

  • Improved the labels being used for each taxonomy, so that you don’t get random mentions of “tags” in the WordPress UI.
  • Improved the add_place_to_post() method so that you can add multiple Places to a single Post.
  • Now exposing both the people and places taxonomies via the REST API.

Keyring Social Importers

  • Added a filter so that you can easily (and globally) disable downloading of full content for Instapaper articles.
  • Made it easy to inject custom CSS for a specific importer.
  • Added a Nest Camera service and importer. With the recent updates, it downloads a snapshot from the specified camera(s) during the hour indicated, auto-tags it using the location of the camera, and also associates it with a Place if People & Places is co-installed.
  • The Instagram importer now handles video posts properly, and will download the full video and embed it into your posts. Bundled a reprocessor to fix old posts, which would have previously been handled as image posts.
  • Also made the Instagram importer link up People mentioned in captions (not just those who are properly tagged as being in a post).
  • Fixed a bug in the Twitter importer which was mangling newlines. Added a reprocessor to fix it in old posts as well.
  • Now exposing where a post was imported from in the REST API.
  • Added Places support to the TripIt importer, which associates each post with Places for each airport flown through on that trip.

Keyring Social Importers has been updated in the WordPress.org plugin directory (version 1.7, or get it from GitHub) and you can get the latest version of People & Places from GitHub (still not an “official” plugin yet).

You can see most of them in action on my site, Dented Reality, which uses them to aggregate most of my online social activity. The People & Places data is not directly exposed yet, but you can see it in the REST API output.

[Screenshot: Places support added to TripIt importer.]
[Screenshot: Added a Nest Camera importer.]
[Screenshot: Current list of data reprocessors.]

Social Importer Upgrade

Today I pushed some updates to:

  1. People & Places
  2. Keyring Social Importers

With these updates, the Twitter, Foursquare and Instagram importers now dynamically identify and index People and Places, and mark them with a taxonomy within WordPress. I’ve also added a new system for “reprocessing” old posts which Keyring imported, so that you can go back and perform some function on those posts without having to import them again. You’ll find reprocessing tools under Tools > Import > Reprocess Keyring Data.

Reprocessing works from the locally stored copy of the import data, which is saved when each item is first imported. The system is fully hookable, so you can add other reprocessing routines via a plugin. The core file comes bundled with one that attempts to address an old JSON-data-escaping issue, and I’ve added extensions to the importers listed above which allow you to go back and reprocess your posts for People/Places.
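
To make the "hookable" idea a bit more concrete, here is a rough sketch in Python. It is not the plugin's actual (PHP) code: the registry, the `raw_import_data` field, and the `mentions` key are all hypothetical names chosen for illustration. It only shows the general shape of registering a routine and running it over the stored raw copy of each post.

```python
import json
from typing import Callable, Dict, List

# Hypothetical registry mapping a reprocessor name to its callback.
# A callback receives the post and its decoded raw import data, and
# returns True if it changed the post.
Reprocessor = Callable[[dict, dict], bool]
REPROCESSORS: Dict[str, Reprocessor] = {}


def register_reprocessor(name: str, callback: Reprocessor) -> None:
    """Add a reprocessing routine (the 'hook' a plugin would use)."""
    REPROCESSORS[name] = callback


def reprocess(posts: List[dict], name: str) -> int:
    """Run one registered routine over every post; return how many changed."""
    callback = REPROCESSORS[name]
    changed = 0
    for post in posts:
        raw = post.get("raw_import_data")  # assumed field name
        if raw is None:
            continue  # nothing stored locally for this post
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unreadable raw data, skip it
        if callback(post, data):
            changed += 1
    return changed


def tag_people(post: dict, data: dict) -> bool:
    """Example routine: copy mentioned people from the raw data onto the post."""
    people = post.setdefault("people", [])
    added = False
    for mention in data.get("mentions", []):  # assumed key in the raw data
        if mention not in people:
            people.append(mention)
            added = True
    return added


register_reprocessor("people", tag_people)
```

Posts whose raw copy can't be decoded are simply skipped here, which hints at why a routine that repairs the stored data is worth running before the others.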

If you’re going to use them, I suggest you run the first one (the JSON-escaping fix bundled with the core file) before any of the others, which you can then run in any order you like. Running that one first just makes sure that as much of your data as possible is processable.

It’s worth mentioning that if you use these reprocessors, they can take a while (especially if you have a lot of data already), and that they will likely create a lot of new data (in the form of People and Place terms being created in their respective taxonomies). After running all of them over all of my data, I have almost 1,800 People and just over 3,000 Places in my database.

The other tool added in this upgrade is the ability to merge terms, which becomes important with all of this data.

When the importers are dynamically adding People and Places, they only match based on known identifiers. This means that you’re likely to end up with duplicate entries, especially if you’re processing multiple services (e.g. Foursquare and Instagram). Using the merge tool, you can browse through your entries and select two or more, then use the Bulk Actions drop-down to select “Merge” and hit “Apply”. Terms will be merged together as intelligently as possible, which basically means that the shortest slug of the group will be kept, and the longest strings for any conflicting fields will be kept. You can of course edit the resulting composite term afterwards and tweak things as you see fit.

If you’re looking for a shortcut to identify duplicate entries, try searching for “-2”, which will give you a list of duplicates. For each one, you’ll then need to search for something that brings up all of its dupes, select them, merge, and repeat. It’s a little bit tedious, but you’ll only need to do it once for each duplicate, and all future imports should match against the composite entry.
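
The merge rule described above (keep the shortest slug, keep the longest string for each conflicting field) can be sketched roughly like this in Python. This is only an illustration of the rule, not the plugin's real code; the term layout as a flat dictionary of strings and the field names are assumptions.

```python
from typing import Dict, List


def merge_terms(terms: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge two or more terms into one composite term.

    Assumed term shape: a dict with a "slug" plus any other string fields.
    """
    if len(terms) < 2:
        raise ValueError("need at least two terms to merge")

    merged: Dict[str, str] = {}

    # Keep the shortest slug of the group (first one wins on a tie).
    merged["slug"] = min((term["slug"] for term in terms), key=len)

    # For every other field, keep the longest string any term has for it.
    fields = {key for term in terms for key in term if key != "slug"}
    for field in sorted(fields):
        merged[field] = max((term.get(field, "") for term in terms), key=len)

    return merged
```

For example, merging a term with slug `jane-doe` and name `Jane` with a second one with slug `jane-doe-2` and name `Jane Doe` would keep the `jane-doe` slug and the `Jane Doe` name.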

Oh, and one last thing — I threw in a quick map on the details page for Places, which provides a nice quick, visual confirmation that it’s the correct location. For now it’s using a very basic OpenStreetMap example, but I might switch it out to Leaflet at some point, which is pretty nice.

People & Places

Over the years, I’ve been working on a system to aggregate data that I publish to other social networks/sites back into my control, on my own WordPress install. Thus far, that has resulted in the creation of Keyring (plugin) to provide an abstracted interface to all of the web services I’m interested in, Keyring Social Importers (plugin) to do the basics of importing the data from different places, and Homeroom (theme) to display it all. Today, I’ve been working on a system that will detect people who are mentioned in an interaction, and link them across posts using a custom taxonomy. It does the same for physical locations, so I’ve called it People & Places.

Essentially, this plugin is just a pair of custom taxonomies, with some specific ways of referring to things. Pretty basic. It gets more interesting though when you update Keyring Social Importers to the trunk version, which will now work in tandem with People & Places to link everything up. I wouldn’t recommend it on a production site just yet, as there are still a lot of rough edges.

When KSI is pulling in content from each service (currently looking at Twitter, Instagram and Foursquare), there’s a new block of code that makes sure People & Places is available, and then looks for certain pieces of data. If it finds them, it bundles up the details, and passes that along in the import process. When posts are actually inserted, it will attempt to link up that post to the People/Places it found. If the People already exist, then they’ll just be linked, in the same way tags work. If they don’t exist yet, then a new Person entry will be created, and that will be used.
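
As a rough illustration of that link-or-create step, here is a small Python sketch. It is not how the plugin is implemented (that lives in PHP on top of WordPress taxonomies); the class, its method, and the idea of keying each Person on a (service, identifier) pair are assumptions made purely for the example.

```python
from typing import Dict, List, Tuple

PersonKey = Tuple[str, str]  # (service name, identifier on that service)


class PeopleTaxonomy:
    """Toy in-memory stand-in for a People taxonomy."""

    def __init__(self) -> None:
        self.terms: Dict[PersonKey, dict] = {}       # known People
        self.links: Dict[int, List[PersonKey]] = {}  # post ID -> linked People

    def link_person(self, post_id: int, service: str,
                    service_id: str, name: str) -> bool:
        """Link a person to a post, creating the Person entry if needed.

        Returns True if a new Person entry had to be created.
        """
        key = (service, service_id)
        created = key not in self.terms
        if created:
            # No existing entry with this identifier, so create one.
            self.terms[key] = {"name": name}
        # Link the post to the Person, much like attaching a tag.
        post_people = self.links.setdefault(post_id, [])
        if key not in post_people:
            post_people.append(key)
        return created
```

Because the lookup here is keyed on a per-service identifier, the same real-world person seen on two different services ends up as two separate entries in this sketch.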

I plan to add in a basic term-merging function, so that you can manually (maybe automatically?) identify “duplicate people” across different networks, and intelligently merge their entries (re-linking any posts involved), so that you build up a single, combined view of your interactions with a particular person. I envisage some interesting possibilities with the archive pages for these taxonomies, and that over time it will build a really interesting dataset of your interactions, the places you physically go, etc.

I’ll probably still move the code around a bit, and there are definitely some bugs around duplicates and handling things across different networks, but it seems to be working so far. This is also probably the time to figure out a decent way to allow re-processing of imported data from the raw copy that the importers save in postmeta. Installing this new code will start gathering data on new imported entries, but won’t go back and do the same on all the posts you’ve already got. Rather than deleting all that data and re-importing/processing everything, I’d like to have a simple way to re-process the raw data that’s already stored locally.