What level of Open Access scholar are you?

Today is a feast for Open Access fans at Impactstory!

Your scholarship is more valuable when it’s available to everyone: free to be widely read, discussed, and used.  Realizing this, funders increasingly mandate that articles be made freely available, and OA journals and repositories make it increasingly easy.

And today at Impactstory, we make it visible!

Where your articles have free fulltext available somewhere online, your Impactstory profile now links straight to it (we’ve found many of these automatically, but you can add links manually, too). Now along with seeing the impacts of your work, folks checking out your profile can read the papers themselves.

But openness is more than just a handy bonus: it’s an essential qualification for a modern scholar. That’s why there’s growing interest in finding good ways to report on scholars’ openness–and it’s why we’re proud to be rolling out new Open Access awards. If 10% of your articles are OA (gold or green), you get an Open Access badge at the top of your profile. For the more dedicated, there are Bronze (30% OA) and Silver (50%) award levels. The elite OA vanguard with over 80% OA articles gets the coveted Gold-level award. So…which award did you get? How open are you? Check Impactstory to find out!
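For the curious, the cutoffs work out to something like this (a quick Python sketch of the levels described above; the exact boundary handling in our production code may differ):

```python
def oa_award_level(num_oa_articles, num_articles):
    """Map a profile's OA fraction onto an award level.

    Illustrative sketch only; boundary handling here may not match
    Impactstory's production logic exactly.
    """
    if num_articles == 0:
        return None
    oa_fraction = num_oa_articles / num_articles
    if oa_fraction > 0.80:
        return "Gold"
    if oa_fraction >= 0.50:
        return "Silver"
    if oa_fraction >= 0.30:
        return "Bronze"
    if oa_fraction >= 0.10:
        return "Open Access"   # the basic OA badge
    return None
```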

To celebrate the launch, we’re giving away this awesome “i ♥ OA” tshirt, featuring the now-classic OA icon and our new logo, to one randomly-drawn Bronze or higher level OA scholar on Monday.

Don’t have a Bronze level award yet? Want to see some more of those “unlocked” icons on your profile?  Great! Just start uploading those preprints to improve your OA level and get your chance at that t-shirt. 🙂

Finally, we’ve saved the most exciting Impactstory OA news for last: we’ll also be sending one of these new t-shirts to Heather Joseph, Executive Director of SPARC.  Why? Well, partly because she is and has been one of the OA movement’s most passionate, strategic, and effective leaders. But, more to the point, because we’re absolutely thrilled to be welcoming Heather to Impactstory’s Board of Directors.  Heather joins John Wilbanks on our board, filling the vacancy left by Cameron Neylon as his term ends.  Welcome Heather!

Bringing article fulltext to your Impactstory profile

Your work’s impact helps define your identity as a scientist; that’s why we’re so excited about being the world’s most complete impact profile. But of course the content of your work is a key part of your identity, too.

This week, we’re launching a feature that’ll bring that content to your Impactstory profile: if there’s a free fulltext version of one of your articles, we’ll find it and automatically link to it from your profile.

We’ll be automatically checking tons of places to find where an article’s freely available (a rough sketch of this lookup cascade follows the list):

  • Is the article in PMC?
  • Is it published in a journal listed in the Directory of Open Access Journals?
  • Is it published in a journal considered “open access” by Mendeley?
  • Does PubMed LinkOut point to a free full-text version?
  • If it’s in none of these, is it in our custom-built list of other open sources (including arXiv, figshare, and others)?
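In code, the whole lookup is essentially a cascade that stops at the first hit. Here’s a minimal sketch (the individual check functions are hypothetical stand-ins, not our actual implementation):

```python
from typing import Callable, Iterable, Optional

Article = dict  # e.g. {"doi": "...", "pmid": "...", "journal": "..."}
Check = Callable[[Article], Optional[str]]  # returns a fulltext URL, or None

def find_free_fulltext(article: Article, checks: Iterable[Check]) -> Optional[str]:
    """Run the openness checks listed above, in order, and return the first URL found.

    The checks themselves (PMC, DOAJ, Mendeley's OA flag, PubMed LinkOut,
    and our custom source list) are left as hypothetical callables here.
    """
    for check in checks:
        url = check(article)
        if url:
            return url
    return None  # nothing found automatically; users can add a link by hand
```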

Of course, even with all these checks, we’re going to miss some sources–especially self-archiving in institutional repositories. So we’ll be improving our list over time, and you’ll be able to easily add your own linkouts when we miss them.

We’re excited to pull all this together; it’s another big step toward making your Impactstory profile a great place to share your scholarly identity online. Look for the release in a few days!

Who’s the tweetedest?

Formal citations are important, but it’s the informal interactions that really power the scientific conversation. Impactstory helps our users observe these. And since Monday, they’ve been able to observe them a lot more clearly: adding Twitter data from Altmetric.com has significantly improved our coverage, to the point where we’re confident saying Impactstory is the most comprehensive source of scholar-level Twitter data in the world.

We wanted to play with all this data a little, so we thought it’d be fun to find the three most tweeted scholars on Impactstory.  Congrats to Ethan White, Ruibang Luo, and Brian Nosek: y’all are the Most Tweeted, with nearly 1000 tweets each mentioning your research papers, preprints, and datasets!

But of course, while these numbers are impressive, they’re far from the whole story. By diving into the content of individual tweets, we can learn a lot more.

For instance, Ethan posted a grant proposal on figshare. This isn’t a traditional paper; it’s not even cited (yet). It’s not helping Ethan’s h-index. But it is making an impact, and looking at Twitter can help us see how. Zooming in, we can find this take from @ATredennick, a PhD candidate in ecology at Colorado State:

Thanks @ethanwhite for posting successful NSF proposal, http://bit.ly/MeKXsP . Very useful for early-career scientists.

That’s one tweet; there are 53 others for this product. Now we’re looking beyond simple counts and starting to tell data-driven stories–stories we’d never see otherwise.

Right now we’re only linking to a subset of tweets for each product, but we’re working to add the ability to see all of ‘em. We’re also going to be bringing data about tweet locations and authors (are you being tweeted by a fellow scientist? a blogger? your labmates?) right into your profile. If you’ve got other ideas for Twitter features, let us know!

In the meantime: congrats again to Brian, Ruibang, and Ethan! We’ll be sending them each a swag bag with an Impactstory “I am more than my h-index” tshirt, and stickers featuring our new logo.

Want to find who’s tweeting your science? Make your profile to find out!

Topsy ending data access

Last month, the Twitter data provider Topsy was acquired by Apple. No one seems real clear on what Apple intends to do with its new acquisition, but we can tell you what they won’t be doing: continuing to provide our Twitter data. They’ve informed us this service is being turned off early next month.

Thankfully, we’d already started looking into switching to Altmetric.com as our Twitter data provider. Not only are they still, you know, in business–they also offer significantly improved coverage of most research products.

However, Topsy’s exit does have implications for you, our users. First, although our Twitter tracking for scholarly articles, preprints, and datasets has improved thanks to Altmetric.com, we’re losing our ability to track tweets on other kinds of products like GitHub and SlideShare. Second, we need to disable our Twitter and WordPress Blog products: they relied heavily on Topsy data. Tweets and blog posts will stop displaying on profiles in the next few days.

We’re disappointed about losing these features; we know many of you loved them, and we did too. As many folks have pointed out, one of the key challenges of altmetrics is securing persistent, open access to data (the same is true, for that matter, of bibliometrics in general). So we’ve planned for this sort of thing, but it’s still no fun.

The good news is that we’re still committed to these features, especially getting great impact metrics for users’ blogs and Twitter feeds. We’re looking into several replacement approaches now, and we’re optimistic. A lot depends on how much demand we hear, since that will help us decide what to prioritize. As always, if it matters to you, let us know; we’ll listen.

New Impactstory Logo

We’re excited to announce a new logo–and a chance to win a free shirt!

The new logo reflects our focus on building great impact profiles for individual scientists: the “i” in the middle stands for Impactstory of course, but it’s also the first-person pronoun. Your Impactstory profile is about you. We’re building something to represent you, the working scientist, better than anything else out there. It’s a place for information (hence the i-with-a-circle-around-it iconography), but also identity.

Identity in science is pretty broken in several ways, but one of the biggest is our growing reliance on one-dimensional, reductive currencies like the h-index and the Impact Factor. We’re fixing that. Impactstory’s a place where you can tell your whole impact story, where your identity is more than a number. You are more than your h-index. We’ll be focusing hard on this message this year.

Finally, we love our new logo because it anticipates important upcoming features and product focus (Spoiler Alert!). We’re going to be adding a growing number of features that recognize scientific excellence along multiple dimensions, highlighting areas where our users are winning–the badge-esque scallops on the logo reflect these upcoming features.

To celebrate our new logo, we’re going to send out a cool new “I am more than my h-index” tshirt to a lucky Impactstory user — we’ll do a random drawing Friday of everybody who visits their profile this week.  We’re also happy to send some brand new stickers to anyone who wants them… drop us a line at team@impactstory.org and we’ll get some out to you.

Link your figshare and ImpactStory accounts

We’re big fans of figshare at ImpactStory: it’s one of a growing number of great ways to get research data into the open, where others can build on it.

So we’re excited today to announce figshare account integration in ImpactStory! All you have to do is paste in a figshare account URL; then, in the background, we gather your figshare datasets and report their views, downloads, tweets, and more.

The best part is that you’ll see not just numbers, but your relative impacts compared to the rest of figshare. For instance, here’s a figshare product with 40 views, putting it in at least the 67th percentile compared to other figshare datasets that year.  Here’s an even better one: not only is it in the 97th percentile of views, it’s also been downloaded and tweeted.

If you’ve already got an ImpactStory profile, just click “import products” to add your figshare account (you can also still paste individual DOIs in the “Dataset DOIs” importer). If you don’t have an ImpactStory account yet, now’s a great time to make one–you can be checking out your figshare impacts in less than five minutes.

figshare’s tagline encourages you to “get credit for all your research.” We think that’s a great idea, and we’re excited about making it easier with ImpactStory.

ImpactStory awarded $300k NSF grant!

We’re thrilled to announce that we’ve been awarded a $297,500 EAGER grant from the National Science Foundation to study how automatically-gathered impact metrics can improve the reuse of research software. The grant (posted in its entirety on figshare) has three main components:

First, we’ll improve ImpactStory’s ability to track and display the impact of research software. We’ll build tools to uncover where and how software is downloaded, installed, extended, and used; we’ll also mine the research literature to find how software is being reused to make new studies possible. We’ll present all this impact information in an easy-to-understand dashboard that researchers can share.

Second, we’ll be using quantitative and qualitative approaches to see if this impact data helps promote actual software reuse among researchers. We’ll gather data for a sample of software projects, survey researchers, and track inclusion of impact data in grant, tenure, and promotion materials.

Finally, we’ll work to build an engaged community of researchers to help support the project, starting with a group of ImpactStory Software Impact Advisors; these folks will help us with feedback and ideas, and also let us know when and how they’re using software impact metrics in their own professional practice.

The long-term goal of the project is big: we want to transform the way the research community values software products. This is in turn just one part in the larger transformation of scholarly communication, from a paper-native system to a web-native one.

Of course we’re not going to achieve all that in a two-year grant. But we do think we can offer key support to this revolution in the making, and we can’t wait to get started. Thanks, NSF; it’s going to be an exciting two years!

A new framework for altmetrics

At total-impact, we love data. So we get a lot of it, and we show a lot of it, like this:


There’s plenty of data here. But we’re missing another thing we love: stories supported by data. The Wall Of Numbers approach tells much, but reveals little.

One way to fix this is to Use Math to condense all of this information into just one, easy-to-understand number. Although this approach has been popular, we think it’s a huge mistake. We are not in the business of assigning relative values to different metrics; the whole point of altmetrics is that depending on the story you’re interested in, they’re all valuable.

So we (and from what they tell us, our users) just want to make those stories more obvious—to connect the metrics with the story they tell. To do that, we suggest categorizing metrics along two axes: engagement type and audience. This gives us a handy little table:

Now we can make way more sense of the metrics we’re seeing. “I’m being discussed by the public” means a lot more than “I seem to have a lot of blog posts, some tweets, and a ton of Facebook likes.” We can still show all the data (yay!) in each cell—but we can also present context that gives it meaning.
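As a rough illustration (the cell assignments below are examples, not our official table), you can think of each metric as carrying an (audience, engagement) tag, with raw counts rolled up into cells:

```python
from collections import defaultdict

# Example assignments only; the real table may slice some of these differently.
METRIC_CELLS = {
    "mendeley:readers":   ("scholars", "saved"),
    "scopus:citations":   ("scholars", "cited"),
    "pdf:downloads":      ("scholars", "viewed"),
    "twitter:tweets":     ("public",   "discussed"),
    "facebook:likes":     ("public",   "discussed"),
    "wikipedia:mentions": ("public",   "cited"),
    "html:views":         ("public",   "viewed"),
}

def summarize(raw_counts):
    """Roll raw metric counts up into (audience, engagement) cells."""
    cells = defaultdict(int)
    for metric, count in raw_counts.items():
        cells[METRIC_CELLS.get(metric, ("unknown", "unknown"))] += count
    return dict(cells)

# summarize({"twitter:tweets": 12, "facebook:likes": 30})
#   -> {("public", "discussed"): 42}
```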

Of course, that context is always going to involve an element of subjectivity. I’m sure some people will disagree about elements of this table. We categorized tweets as public, but some tweets are certainly from scholars. Sometimes scholars download html, and sometimes the public downloads PDFs.

Those are good points, and there are plenty more. We’re excited to hear them, and we’re excited to modify this based on user feedback. But we’re also excited about the power of this framework to help people understand and engage with metrics. We think it’ll be essential as we grow altmetrics from a source of numbers into a source of data-supported stories that inform real decisions.

Choosing reference sets: good compared to what?

In the previous post we assumed we had a list of 100 papers to use as a baseline for our percentile calculations. But what papers should be on this list?

It matters: not to brag, but I’m probably a 90th-percentile chess player compared to a reference set of 3rd-graders. The news isn’t so good when I’m compared to a reference set of Grandmasters. This is a really important point about percentiles: they’re sensitive to the reference set we pick.

The best reference set to pick depends on the situation, and the story we’re trying to tell. Because of this, in the future we’d like to make the choice for total-impact reference sets very flexible, allowing users to define custom reference sets based on query terms, DOI lists, and so on.

For now, though, we’ll start simply, with just a few standard reference sets to get going.  Standard reference sets should be:

  • meaningful
  • easily interpreted
  • not too high impact nor too low impact, so gradations in impact are apparent
  • applicable to a wide variety of papers
  • amenable to large-scale collection
  • available as a random sample if large

For practical reasons we focus first on the last three points.  Total-impact needs to collect reference samples through automated queries.  This will be easy for the diverse products we track: for Dryad datasets we’ll use other Dryad datasets, for GitHub code repositories we’ll use other GitHub repos.  But what about for articles?  

Unfortunately, few open scholarly indexes allow queries by scholarly discipline or keywords… with one stellar exception: PubMed.  If only all of research had a PubMed!  PubMed’s eUtils API lets us query by MeSH indexing term, journal title, funder name, all sorts of things.  It returns a list of PMIDs that match our queries.  The API doesn’t return a random sample, but we can fix that in our own code (a rough sketch follows).  We’ll build ourselves a random reference set for each publishing year, so a paper published in 2007 would be compared to other papers published in 2007.
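Here’s a minimal sketch of that approach against the eSearch endpoint (the query below is just an example; a production version would page through large result sets and respect NCBI’s rate limits):

```python
import random
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_random_sample(term, year, sample_size=100):
    """Draw a random sample of PMIDs matching `term` for one publication year.

    Sketch only: we fetch up to retmax IDs and sample from them; for result
    sets bigger than retmax you'd sample random retstart offsets instead.
    """
    params = {
        "db": "pubmed",
        "term": f"({term}) AND {year}[pdat]",
        "retmax": 10000,
        "retmode": "json",
    }
    response = requests.get(ESEARCH, params=params)
    response.raise_for_status()
    pmids = response.json()["esearchresult"]["idlist"]
    return random.sample(pmids, min(sample_size, len(pmids)))

# e.g. a year-matched reference set drawn from one MeSH term:
# reference_pmids = pubmed_random_sample("neoplasms[mesh]", 2007)
```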

What specific PubMed query should we use to derive our article reference set?  After thinking hard about the first three points above and doing some experimentation, we’ve got a few top choices:

  • any article in PubMed
  • articles resulting from NIH-funded research, or
  • articles published in Nature.

All of these are broad, so they are roughly applicable to a wide variety of papers.  Even more importantly, people have a good sense for what they represent — knowing that a metric is in the Xth percentile of NIH-funded research (or Nature, or PubMed) is a meaningful statistic.  

There is of course one huge downside to PubMed-inspired reference sets: they focus on a single domain.  Biomedicine is a huge and important domain, so that’s good, but leaving out other domains is an unhappy limitation.  We’ll definitely be keeping an eye on other solutions to derive easy reference sets (a PubMed for all of science?  An open social science API?  Or hopefully Mendeley will include query by subdiscipline in its API soon?).

Similarly, a Nature reference set covers only a single publisher—and one that’s hardly representative of publishing as a whole. As such, it may feel a bit arbitrary.

Right now, we’re leaning toward using NIH-funded papers as our default reference set, but we’d love to hear your feedback. What do you think is the most meaningful baseline for altmetrics percentile calculations?

(This is part 5 of a series on how total-impact will give context to the altmetrics we report.)

Percentiles, a test-drive

Let’s take the definitions from our last post for a test drive on tweeted percentiles for a hypothetical set of 100 papers, presented here in order of increasing tweet count with our assigned percentile ranges:

  • 10 papers have 0 tweets (0-9th percentile)
  • 40 papers have 1 tweet (10-49th)
  • 10 papers have 2 tweets (50-59th)
  • 20 papers have 5 tweets (60-79th)
  • 1 paper has 9 tweets (80th)
  • 18 papers have 10 tweets (81-98th)
  • 1 paper has 42 tweets (99th)

If someone came to us with a new paper that had 0 tweets, given the sample described above we would assign it to the 0-9th percentile (using a range rather than a single number because we roll like that).  A new paper with 1 tweet would be in the 10th-49th percentile.  A new paper with 9 tweets is easy: 80th percentile.

If we got a paper with 4 tweets we’d see it’s between the datapoints in our reference sample — the 59th and 60th percentiles — so we’d round down and report it as 59th percentile.  If someone arrives with a paper that has more tweets than anything in our collected reference sample we’d give it a 100th percentile.
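In code, those rules look roughly like this (an illustrative sketch, not our production implementation):

```python
def percentile_range(new_count, reference_counts):
    """Assign a percentile range to new_count using the rules described above.

    Sketch only: ties share the whole range they cover, values falling in a
    gap round down, and anything above the sample maxes out at 100.
    """
    n = len(reference_counts)
    below = sum(1 for c in reference_counts if c < new_count)
    ties = sum(1 for c in reference_counts if c == new_count)

    if below == n:      # more tweets than anything in the reference sample
        return (100, 100)
    if ties:            # matches observed values: report their whole range
        return (100 * below // n, (100 * (below + ties) - 1) // n)
    low = 100 * below // n - 1   # falls in a gap between observed values: round down
    return (low, low)

# Using the 100-paper sample above:
reference = [0]*10 + [1]*40 + [2]*10 + [5]*20 + [9] + [10]*18 + [42]
assert percentile_range(0, reference) == (0, 9)
assert percentile_range(4, reference) == (59, 59)
assert percentile_range(9, reference) == (80, 80)
assert percentile_range(50, reference) == (100, 100)
```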

Does this map to what you’d expect?  Our goal is to communicate accurate data as simply and intuitively as possible.  Let us know what you think!  @totalimpactorg on twitter, or team@total-impact.org.

(part 4 of a series on how total-impact plans to give context to the altmetrics it reports. see part 1, part 2, and part 3.)