Automatically update citation metrics in your CV with a bare minimum script - Part I

The inspiration behind this post comes from my non-computational scientist colleagues who simply wanted to import Google Scholar citation metrics in their CVs and resumes with as little overhead as possible. After a few quick searches, several solutions popped up that used either Markdown or LaTeX. While I appreciate these more sophisticated maneuvers, they did not serve any useful purpose for my specific motivation as described below.

Motivation

There are just a few things my scientist colleagues wanted to achieve:

  1. Ability to edit their CVs and resumes in Microsoft Word,
  2. Without the need to learn a new language or markup syntax, and
  3. Without falling back to a manual update which can be inconvenient.

Fortunately, the R community has already done much of the hard work, thanks to the awesome officer and scholar packages. Briefly, the officer package lets users manipulate Word documents without altering the contents, styles, and properties of the original document, while the scholar package provides handy functions to extract citation data from Google Scholar.

Among other things, I wanted this solution to be a bit more robust without being too complicated for future expansion. In particular, the possibility of adding other elements above and beyond citation metrics such as images, tables, and text from R as well as portability to other documents such as PowerPoint (*.pptx) was on the top of my list. Beyond everything, I wanted to make this exercise as seamless and efficient as possible for my scientist colleagues irrespective of their coding background.

Before getting started, here are a few things we will need: (i) A CV or resume written in Word (.docx), (ii) R, and (iii) RStudio. Information about installing R and RStudio is here (it’s free!). This tutorial also assumes that you have an up-to-date Google Scholar profile. All files described in this tutorial are publicly available.

So, here goes the simple 3-step solution.

Step 1: Customizing the Word Document

For illustration purposes, I will use a template resume named Resume_Master.docx which is mostly empty at this point:

image info

The first step is to add a bookmark where you want to insert your citation metrics. If you are unfamiliar with bookmarks in Word documents, I suggest reading the Microsoft Office tutorial which is pretty concise and hence skipped here. For our specific task, I want to define three bookmarks right after the Publications section as follows:

image info

In particular, I intend to import total citation counts, h-index, and i10 index on these pre-defined bookmarks GS_a, GS_b, and GS_c respectively, as verified by the bookmark window on the right.

Step 2: Running a minimal R script

This step is actually pretty simple as long as you know how to run a script in RStudio. A quick breakdown of what this bare minimum script (Update_Resume.R) does is as follows.

image info

First, it installs and loads the required packages. In the second chunk of the code, it extracts the relevant Google Scholar citation metrics (make sure you have an internet connection while executing this part). Here I have used my own Google Scholar ID twbXG-wAAAAJ (don’t forget to change to yours). Finally, it inserts these metrics into your input Word Document (in this case Resume_Master.docx) and outputs an updated document called Resume_Final.docx. With my own humble contribution to science, the resulting file looks something like this (last retrieved on February 16, 2020):

image info

Please note that, this script assumes that all the associated files are co-located in the same directory. Please make sure to change the names of your input and output files in this script as desired before executing.

The cool thing is that you can seamlessly edit your master file as you would normally do and run this script as frequently as you want to get the citations metrics inserted automatically.

That’s it! You have the exact same resume with the citation metrics inserted.

(Optional) Step 3: Automating with cron

Now, if you would like to further fine-tune this process without the hassle of running the script every other day, you can optionally do that thanks to the amazing cronR package. Fortunately, this step is dead easy as well, especially if you are already familar with R.

In particular, once you are able to install the cronR package following instructions here, you can jump right into the RStudio add-in section of that tutorial.

As an example, if you want to execute the Update_Resume.R script hourly, all you have to do is the following before pressing the Create job button:

image info

Once you schedule the job above, it will update every hour whenever your computer is switched on (you can change the schedule any time). You can also save the resulting file in a shared or publicly accessible folder (such as Dropbox, Google Drive, and OneDrive) and easily embed it on your personal website if you would like.

Happy automating! It’s not that difficult after all :)

Avatar
Himel Mallick, PhD, FASA
Principal Investigator

Applied statistician with broad research interests in biomedical and applied data science, working on problems in machine learning and computational biology.