April 4, 2019

Data Analysis Technical Special: Fixing Package Installation Problems for R and R-studio in OS-X.

What does this mean when I am trying to update R packages in R-Studio?

"xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun"

R-Studio is a popular workspace tool for getting started with R. I have used R for some years, mainly the dose-response-curve (drc) package, to fit our dose-response data. We study cells from human donor retinas, called endothelial cells. They form the inside of retinal blood vessels. Most new blindness in the United States each year is caused by diabetic retinopathy, which damages blood vessels.


Wanting to update my own R-programming abilities and find better ways to teach my own students and staff how to use R, I have been working through "Getting Started with R: An Introduction for Biologists" (2nd Edition, Beckerman, Childs, Petchey). The book makes the excellent point that the first problem in learning R for most students is getting stuck at how to get their data into R. It's not hard, but the process is explained poorly in many R reference books. Unfortunately, while the authors have a good chapter on getting data into R, the reality is that some of the functions needed to read in data from .csv formatted files are in the R package called readr. R-Studio will ask you to let it install readr if it is missing from your library of packages, and you can select YES and Install, but sometimes the process fails. So as a new student of R, you get stuck at .... trying to get your data into R!! How do you fix this problem? Read on...

First note. If you have updated your system (OS-X) to a newly named release, you should search for XQuartz and rerun the XQuartz installer (https://www.xquartz.org/). This is the X-windows system that runs on Mac OS-X. Many UNIX-based programs use this library to make windows, and XQuartz is not automatically included in newer versions of OS-X. Even if you have it installed already, reinstall XQuartz after updating your system, from Sierra to High Sierra, for example.

You can remove R from your system to make sure the old R framework is gone and gets replaced with a fresh R installation, although running an R installer should replace it all properly for you. Find the version of R to install on the CRAN website at https://cran.r-project.org/. Pay attention to the instructions on which other UNIX packages you should also update. These can include updated versions of Clang and GFortran (GNU Fortran), and CRAN provides links to those installer packages too. So do a fresh installation of Clang and GFortran. These provide compilers, so when R-Studio gives you the option of installing the latest "source code" version of a package, you can select that option.

If you use R-Studio's Install Packages menu to update readr, you will see console messages informing you that a binary version is available. This is a version already compiled for your Mac from the source code used to write the readr package. The console may also inform you that the most recent version is available only as source code, which must be compiled by your Mac into binary code during installation. You will be asked if you want to try updating using this source version (Y or N). Typing N (no) will have R-Studio try to download the binary version, but you may get a message that an associated function is missing and the update is aborted. Or you may find that functions that are part of the readr package still fail to run. If, on the other hand, you select Y (yes) to install from the source version, you may get an error about something called xcrun:

xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun

That indicates that your Mac cannot find xcrun, the tool that locates the programs needed to compile the source code, so your package install fails again. The older binary version may not be up to date enough to fix your functions, but your Mac is missing the underlying Xcode Command Line Tools required to install the more up-to-date source version. This means you still need to install the Command Line Tools, or Xcode.
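Before installing anything, it can help to confirm the diagnosis from the Terminal. The following is only a minimal sketch: the helper name dev_dir_status is made up for illustration, and the path is the one shown in the error message above.

```shell
# Hypothetical helper: print "ok" if the given directory exists, "missing" otherwise.
dev_dir_status() {
  if [ -d "$1" ]; then
    echo "ok"
  else
    echo "missing"
  fi
}

# The developer path named in the error message.
dev_dir_status /Library/Developer/CommandLineTools

# Ask xcrun itself; it is expected to fail when the tools are absent.
if xcrun --version >/dev/null 2>&1; then
  echo "xcrun works"
else
  echo "xcrun is not usable"
fi
```

If the first line prints "missing" or the second check reports that xcrun is not usable, the source-code install in R-Studio will most likely keep failing until the tools are installed.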

xcrun is part of the Xcode Command Line Tools, which most of us have not installed unless we selected them during a new system installation. My solution, so that I could install the latest versions of packages like readr, was to also make sure I had a fresh install of the Xcode tools for my current system. Here is how to do that:

Making sure you have Xcode too:

Open the Terminal application (Terminal.app), which resides in your Utilities folder. Use the Go menu in your Finder to open the Utilities folder.

You will run the following command to install what you need. You have to be connected to the internet, of course. Type in:

xcode-select --install

Press Enter. Your Mac will then connect to Apple and offer to download the installer for the Xcode Command Line Tools (the compilers and xcrun, rather than the full Xcode application). Click Install, accept the license, and let the installation finish.
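Once the installer has finished, one way to confirm that a working compiler is now available is to build a trivial C program. This is a sketch under assumptions: compiler_check is a made-up helper name, and cc is assumed to be the compiler command that the tools put on your path.

```shell
# Hypothetical helper: try to compile and run a do-nothing C program
# with the compiler named in $1, then report the result.
compiler_check() {
  tmp=$(mktemp -d)
  printf 'int main(void) { return 0; }\n' > "$tmp/hello.c"
  if "$1" "$tmp/hello.c" -o "$tmp/hello" 2>/dev/null && "$tmp/hello"; then
    echo "compiler ok"
  else
    echo "compiler not working"
  fi
  rm -rf "$tmp"
}

compiler_check cc
```

If this prints "compiler not working", R-Studio is unlikely to be able to build packages from source either.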

SO NOW, IF YOU ARE READY....

Check List:

  1. Updated your XQuartz install. ___
  2. Fresh install of Gfortran. ___
  3. Fresh install of Clang. ___
  4. Install of Xcode. ___
  5. Fresh install of R. ___
  6. Fresh install of R-studio. ___
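
The check list above can be roughly walked through from the Terminal. This is a sketch only: the check helper is made up for illustration, the two application paths are the usual default install locations and are assumptions, and the script can only tell you that something is present, not that it is a fresh install.

```shell
# Hypothetical helper: print a ticked or empty check-list line, depending on
# whether the command given after the label succeeds.
check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[x] $label"
  else
    echo "[ ] $label"
  fi
}

check "XQuartz"  test -d /Applications/Utilities/XQuartz.app   # assumed location
check "gfortran" command -v gfortran
check "clang"    command -v clang
check "Xcode command line tools" xcode-select -p
check "R"        command -v R
check "R-studio" test -d /Applications/RStudio.app             # assumed location
```

An empty box means that item was not found where the script looked, so it is worth installing or reinstalling it before trying the package updates again.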


Now, hopefully, you can again update R packages in R-Studio. When asked if you want the latest versions that are only available as source code, you can answer Y (yes) and sit back and watch all the activity in your R-Studio console. Your Mac can now find xcrun, compile the source-code versions, and complete your package updates.

This solved my problems with updating packages that obviously had bugs and had to be updated before I could use the functions of two packages in particular: the readr and dplyr library packages. This was important, because without them working properly there is no way to use all that excellent information in "Getting Started with R: An Introduction for Biologists."

In R-Studio, I recommend using the Package Install or Update menu options to update the following three R packages before delving into any guides that will be introducing you to exercises on getting your data from .csv formatted files (comma delimited text files) into R:

1.   readr
2.   dplyr
3.   ggplot2

readr is the home of the read_csv() function for reading your data into R (read.csv(), with a dot, is its base R counterpart and needs no extra package). dplyr is the home of the glimpse() function for checking the composition and variable characteristics of your loaded data. ggplot2 is the home of many updated functions used in plotting your data.
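
If you prefer the Terminal to the R-Studio menus, the following sketch lists which of the three packages are still missing from your R library. It assumes the Rscript command that comes with R is on your path, and r_vector is a made-up helper name.

```shell
# Hypothetical helper: build an R vector literal such as c("readr","dplyr")
# from the arguments given.
r_vector() {
  out=""
  for p in "$@"; do
    out="${out:+$out,}\"$p\""
  done
  printf 'c(%s)\n' "$out"
}

pkgs=$(r_vector readr dplyr ggplot2)

if command -v Rscript >/dev/null 2>&1; then
  # Print the names of the packages that are not installed yet.
  Rscript -e "setdiff($pkgs, rownames(installed.packages()))"
else
  echo "Rscript not found; install R first"
fi
```

Any names it prints are the packages to install or update from the R-Studio menu, or with install.packages() in the R console.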

Have fun analyzing, plotting and doing statistics with your data for FREE, if you get yourself into R. 

Ken Mitton
Associate Professor of Biomedical Sciences
Eye Research Institute
Oakland University
Rochester Michigan






