Computing Resources

Columbia University has a large selection of computing resources available to researchers. For help deciding which option is best suited for you, or for general computing consulting, please email iserpsa@columbia.edu to schedule an appointment. (Items marked "Additional cost" below carry an extra charge; all other resources on this page are available at no additional cost to ISERP affiliates who have paid into the computing fee through a grant managed at ISERP.)
 

CUIT Shared High Performance Computing: Columbia's centrally managed High Performance Computing (HPC) resources on the Morningside Campus offer a Linux-based compute cluster. ISERP, through the SSCC, has purchased a number of nodes, which are available to all ISERP affiliates. See this link or email iserpsa@columbia.edu for more information.

CUIT LionMail Drive: When you create new documents using LionMail Drive, you typically create online Google Docs, Spreadsheets, and Slides. These programs are collaborative tools that allow users to share files and documents with multiple users. There are also web-based editors to create drawings, forms, and fusion tables. These online documents are tightly integrated with other Google Apps and provide powerful real-time collaboration features. In addition to Google products, you can store other shared or personal files in LionMail Drive. You can even get Dropbox-like syncing by installing Drive File Stream. All sensitive and confidential data (such as PII) MUST be encrypted before being stored in LionMail Drive. For details, see http://cuit.columbia.edu/lionmail-drive

Box.com account: Columbia has a Business Associate Agreement with Box.com, allowing the storage of Columbia research data, including sensitive data. If you are familiar with Dropbox, this is a similar offering, and files can be shared with others using granular permissions. Data storage is unlimited, and individual files can be up to 5 GB in size. Box.com is available on the web and on mobile devices, and has apps to sync files to your Mac or Windows computer. Please fill out this Qualtrics form to request a Box.com account.

Secure Data Enclave: Research Computing Services (RCS) manages a Secure Data Enclave (SDE), a virtual environment for working with secure data sets. This service is approved by the IRB as a certified server that meets even the most stringent security requirements of data-granting agencies. The SDE provides Columbia researchers with a secure, remotely accessible, virtual Windows 10 desktop environment to store and collaboratively analyze PII and PHI data as an alternative to traditional "cold room" computing environments. See more details on the CUIT webpage here.

Secure Data Cold Room: Researchers (faculty, graduate students, and undergraduates) who need to work with restricted data sets (e.g., the IES Database) have access to a cold room for this purpose. The cold room is keyed separately from the master key, has a locked cabinet for storing physical media (e.g., CD-ROMs), and contains a computer that complies with secure computing requirements (e.g., no internet, no access to USB drives, password protection). Loading and unloading data must go through a Data Security Officer, currently Eric Vlach at ISERP. Please contact iserpsa@columbia.edu for information about this cold room and to request access to it. Undergraduates must have a faculty sponsor to act as a Principal Project Officer.

Columbia Data Platform: (Additional cost, between $5 and $150/TB/month depending on data archive tier and data type.) A cloud-based solution for research data storage, discovery, analysis, collaboration, and archiving. Certified for the highest level of security with a CUIT system ID, the Data Platform simplifies storage and analysis of data by integrating tools such as Python Jupyter notebooks, visual data filtering and merging, and more.

Electronic Lab Notebook - LabArchives: LabArchives is cloud-based data management software. Electronic lab notebooks are designed to replace paper books, manuals, field research, and data repositories. LabArchives enables collaboration between group members regardless of physical location. If you work with confidential or sensitive data, including personally identifiable data or health information, LabArchives is a good option to consider. As a CUMC security certified system (ID 5644), LabArchives enables central cloud storage and sharing of data of the highest sensitivity. Accounts are provided by the libraries and CUIT at no additional cost - visit www.labarchives.com to get started, select Sign In, then choose Columbia University from the participating institution list. If you'd like to discuss with ISERP whether LabArchives could be useful in your research and data collection, reach out to iserpsa@columbia.edu to schedule an appointment.

Qualtrics: Faculty and graduate students under ISERP's umbrella can request a Qualtrics account under our site license. Qualtrics is a web-based tool that allows users to create surveys and generate reports without any previous programming knowledge. It is a powerful tool for collecting data, running experiments, gathering feedback, and polling users through a variety of distribution channels. Please fill out this form to request an account.

SFTP Server: Faculty and graduate students who need an SFTP account for data transfer between parties can request a temporary, 30-day SFTP account from us. SFTP accounts expire after 30 days, after which all data will be deleted (an extension can be requested). You will get a server address, username, and password to share with your parties. Servers are protected by strict firewalls, so only approved IP addresses can connect. Disk space up to 10 GB is automatically approved. Please fill out this form to request a new account or extend an existing one.

CUIT Virtual Servers: (Additional cost, starting at $127/month for a small server with backup.) CUIT offers a managed virtual machine environment with configurable storage, compute, backup, and disaster recovery options, and will assume responsibility for management as outlined in the SLA. Configurations can be customized based on your needs, and prices will vary accordingly. See https://columbia.service-now.com, go to Catalog, then go to Virtual Server Hosting.

Amazon Web Services (AWS) through CUIT: (Additional cost, per resource used.) Amazon Web Services offers a broad set of global compute, storage, database, analytics, application, and deployment services that help organizations move faster, lower IT costs, and scale applications. Costs will be comparable to AWS list prices; however, aggregating all linked Columbia University AWS accounts under a single University billing account will allow for discounts that are shared among all customers. Pricing plans begin at 20 GB of storage for $10 a year. The annual cost of storage in dollars is the number of GB divided by 2. For example, 100 GB of storage costs $50/year, 200 GB costs $100/year, and so on. Additionally, CUIT will implement AWS Direct Connect private network peering, which will lead to reduced data egress rates and some additional technical capabilities. All AWS purchases should be made through CUIT. For more information regarding the CUIT and AWS enterprise agreement, email aws-request@columbia.edu. For more information, see https://www.amazon.com/clouddrive/learnmore#features-section
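The storage pricing rule above can be sketched as a short calculation. This is only an illustration of the stated arithmetic (GB divided by 2, per year), not an official CUIT or AWS pricing tool. The function name is hypothetical, and billing requests below 20 GB at the 20 GB entry plan is an assumption, since the page only says that plans begin at 20 GB.

```python
def annual_storage_cost_usd(gigabytes: float) -> float:
    """Estimate the yearly storage cost from the stated rule: GB divided by 2."""
    if gigabytes <= 0:
        raise ValueError("gigabytes must be positive")
    # Assumption: requests below the 20 GB entry plan are billed as 20 GB.
    billable_gb = max(gigabytes, 20)
    return billable_gb / 2


if __name__ == "__main__":
    # Reproduce the examples given in the text.
    for size_gb in (20, 100, 200):
        print(f"{size_gb} GB -> ${annual_storage_cost_usd(size_gb):.2f}/year")
```

Actual charges may differ once the shared University discounts are applied, so treat the result as a rough estimate and confirm current pricing with CUIT.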

 

Data Sets

 

We are pleased to announce that ISERP has joined a partnership to make L2's national voter file available through the Columbia Data Platform (CDP).

To request access, please visit https://redivis.com/CPRC. There are three datasets available: L2 Uniform Data, Voter History Data, and Demographics Data. Currently, access to the data is limited to research (with IRB approval) and instructional purposes; additional requirements are listed on the CDP website. Please contact Ashley Ortega, assistant research systems engineer at the Columbia Population Research Center, with any questions regarding access.

Due to the sensitivity of the data, all analysis of personally identifiable data must be done within Redivis by creating project spaces where you can use transform nodes and notebook nodes to query and analyze the data. Exporting data from CDP requires case-by-case approval. At the moment, only de-identified data will be considered for export or download approval.

Newsletter

Don't want to miss our news and updates? Make sure to join our newsletter list.


Contact us

For general questions about ISERP programs, services, and events.

Working Papers Bulletin Sign-up

Sign up here to receive our Working Papers Bulletin, featuring work from researchers across all of the social science departments. To submit your own working paper for our next bulletin, please upload it here, or send it to iserp-communication@columbia.edu.