How to Obtain TCGA DATA
Recommended Post : 【Bioinformatics】 Table of Contents for Bioinformatics Analysis
7. Appendix
1. Common Process
⑴ Search for TCGA on Google
Figure. 1. Common Step 1
⑵ Go to The Cancer Genome Atlas Program - National Cancer Institute at the top
Figure. 2. Common Step 2
⑶ Access the publicly available in the main text
Figure. 3. Common Step 3
⑷ Click on Carts in the upper right corner
Figure. 4. Common Step 4
⑸ Access the GDC Data Transfer Tool on the right side
Figure. 5. Common Step 5
⑹ Install the GDC Data Transfer Tool for your operating system
② Downloading the GDC Data Transfer Tool Client
③ Downloading the GDC Data Transfer Tool User Interface (Beta)
④ Safely download and install both GDC Client and GDC User Interface to the Downloads folder
⑺ Go to https://portal.gdc.cancer.gov/ and access Projects
Figure. 6. Common Step 6
⑻ Select the desired data
Figure. 7. Common Step 7
2. Finishing Method 1
⑴ Click the Manifest button at the top right to download the .txt file
⑵ Run the Command Prompt (cmd) and refer to the location of the text file to enter the following
⑶ You must also enter the command cd Downloads to change the directory
① It is presumed that you need to move to the directory where the gdc-client file is located
② The Manifest file does not necessarily need to be in the same directory
③ If the Manifest file is in the same directory, you can just display the file name and extension
⑷ The following prompt had a minor error in input: gdc-manifest → gdc_manifest
Figure. 8. Finishing Method 1
⑸ The file is saved in ‘C:/Users/sun/’
3. Finishing Method 2
⑴ If you look around, press the Download button to download the .json file
⑵ Open the .json file as a text file and confirm the uuid
⑶ Run the Command Prompt (cmd) and refer to the location of the text file to enter the following
⑷ You must also enter the command cd Downloads to change the directory
① It is presumed that you need to move to the directory where the gdc-client file is located
⑸ gdc-client download uuid
Figure. 9. Finishing Method 2
⑹ The file is saved in ‘C:/Users/sun/’
4. Finishing Method 3
⑴ Open the GDC Data Transfer Tool program on the desktop and drag the Manifest file used in Method 1
Figure. 10. Finishing Method 3
5. Finishing Method 4
⑴ If not confident in the previous methods, use the datasets provided by UCSC XENA
⑵ Link 1. https://ucsc-xena.gitbook.io/project/public-data-we-host/tcga
⑶ Link 2. https://xenabrowser.net/datapages/
6. Troubleshooting
⑴ ERROR: ###: 403 Client Error: FORBIDDEN: { “message”: “Your token is invalid or expired. Please get a new token from GDC Data Portal.” }
① If the Access level is not open but controlled
② Requires a token file equivalent to permission
③ Presumed that the token should be in the same directory as the gdc-client file
7. Appendix
⑴ TCGA Barcodes (ref)
① TCGA-02-0001-01C-01D-0182-01
○ TCGA : Project name
○ 02 : TSS
○ 0001 : Participant
○ 01 : Sample
○ C : Vial
○ 01 : Portion
○ D : Analyte
○ 0182 : Plate
○ 01 : Center
② TCGA-02 : Institution where samples were collected
③ TCGA-02-0001 : Number identifying patients
④ TCGA-02-0001-01 : Sample type of the patient (tumor or normal)
○ 01 : Primary Solid Tumor
○ 02 : Recurrent Solid Tumor
○ 10 : Blood Derived Normal
○ 11 : Solid Tissue Normal
⑤ TCGA-02-0001-01B : Sample piece
⑥ TCGA-02-0001-01B-02 : Sample piece
⑦ TCGA-02-0001-01B-02D-0182 : Plate used for measurement
⑧ TCGA-02-0001-01B-02D-0182-06 : Measurement result considered most accurate after measuring the plate multiple times
○ LAML : Acute Myeloid Leukemia
○ ACC : Adrenocortical carcinoma
○ BLCA : Bladder Urothelial Carcinoma
○ LGG : Brain Lower Grade Glioma
○ BRCA : Breast invasive carcinoma
○ CESC : Cervical squamous cell carcinoma and endocervical adenocarcinoma
○ CHOL : Cholangiocarcinoma
○ LCML : Chronic Myelogenous Leukemia
○ COAD : Colon adenocarcinoma
○ CNTL : Controls
○ ESCA : Esophageal carcinoma
○ FPPP : FFPE Pilot Phase II
○ GBM : Glioblastoma multiforme
○ HNSC : Head and Neck squamous cell carcinoma
○ KICH : Kidney Chromophobe
○ KIRC : Kidney renal clear cell carcinoma
○ KIRP : Kidney renal papillary cell carcinoma
○ LIHC : Liver hepatocellular carcinoma
○ LUAD : Lung adenocarcinoma
○ LUSC : Lung squamous cell carcinoma
○ DLBC : Lymphoid Neoplasm Diffuse Large B-cell Lymphoma
○ MESO : Mesothelioma
○ MISC : Miscellaneous
○ OV : Ovarian serous cystadenocarcinoma
○ PAAD : Pancreatic adenocarcinoma
○ PCPG : Pheochromocytoma and Paraganglioma
○ PRAD : Prostate adenocarcinoma
○ READ : Rectum adenocarcinoma
○ SARC : Sarcoma
○ SKCM : Skin Cutaneous Melanoma
○ STAD : Stomach adenocarcinoma
○ TGCT : Testicular Germ Cell Tumors
○ THYM : Thymoma
○ THCA : Thyroid carcinoma
○ UCS : Uterine Carcinosarcoma
○ UCEC : Uterine Corpus Endometrial Carcinoma
○ UVM : Uveal Melanoma
Input: 2019.08.26 23:32