Study Abroad Acceptance Essay
Recommendation: Study Abroad Admission and Scholarship Overview for 2023-2024
1. Analysis of Study Abroad Admission Cases
2. Study Abroad Acceptance Essay
3. Future Plans
a. Criteria for September 2024 Admission, US Graduate Programs
b. Study Abroad Admission Failure and a New Challenge
1. Analysis of Study Abroad Admission Cases
⑴ Stanford Bioengineering (■ biochem; Spatiotemporal Omics)
○ Motivation for Application: Discovered the department while looking for alternatives to biomedical informatics and biophysics programs, which rarely admit international students, and to CS programs, which are a slightly weaker fit.
○ Expected Acceptance Rate 10%: Highly competitive program, skewed towards protein engineering, but admits many Koreans.
○ Actual Outcome: Rejected (24.02.15)
⑵ Harvard Biophysics (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Determined to apply due to professors like Xiaowei Zhuang.
○ Expected Acceptance Rate 10%: Similar to Stanford but with smaller cohorts. Had in-person contact with a Harvard Faculty member, which should have been a plus.
○ Actual Outcome: Rejected (24.03.07)
⑶ Michigan Bioinformatics (PIBS) (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: Discovered the department while researching bioinformatics. Despite rarely admitting international students, developed an interest due to Professor Minji Kim’s research.
○ Expected Acceptance Rate 0%: Smaller cohorts, PIBS mentions admitting international students below 5%. Initially, students are responsible for their fees, though labs usually fund students.
○ Actual Outcome: Interview Offer (23.12.19) → Interview (24.02.01-02) → Accepted (24.02.09)
⑷ JHU Biomedical Engineering (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: Wanted to apply because of professors like Jean Fan.
○ Expected Acceptance Rate 5%: Seems heavily focused on CS. Wonder if there’s a reason why the pre-contact failed.
○ Actual Outcome: Rejected (24.01.20)
⑸ UCLA Biochemistry, Molecular and Structural Biology (■ biochem; Spatiotemporal Omics)
○ Motivation for Application: Discovered the department after successful contact with Professor Roy Wollman. Decided to apply solely based on professors.
○ Expected Acceptance Rate 60%: Successful pre-meeting with the admission committee head. However, heavily skewed towards biochemistry.
○ Actual Outcome: Interview Offer (23.12.14) → Interview (24.01.09-10) → Accepted (24.01.24)
⑹ UCSD Bioinformatics and Systems Biology (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Discovered the department while researching bioinformatics, particularly single-cell omics.
○ Expected Acceptance Rate 30%: Seems like a good fit. Emphasizes quantitative aspects like GRE. Wonder if there’s a reason why the pre-contact failed.
○ Actual Outcome: Interview Offer (23.12.22) → Interview (24.01.19-22) → Rejected (24.02.28)
⑺ NYU Vilcek Institute of Graduate Biomedical Sciences (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Considered applying to NYU due to Professor Rahul Satija and the program’s broad scope.
○ Expected Acceptance Rate 45%: Large cohorts, high international acceptance rate. Emphasizes quantitative evaluation. Fits well with molecular biology over medical physics.
○ Actual Outcome: Rejected (24.01.13)
⑻ Wisconsin-Madison Biomedical Data Science (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: Decided to apply due to a professor with strong connections.
○ Expected Acceptance Rate 30%: Seems like a decent fit, not too challenging.
○ Actual Outcome: Interview Offer (24.01.24) → Interview (24.01.31-02.08) → Rejected (24.02.22)
⑼ Wisconsin-Madison Medical Physics (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Decided to apply due to a professor with strong connections.
○ Expected Acceptance Rate 7%: Not a perfect fit but a connection to a faculty member provides an advantage. Thought MRI research might also help.
○ Actual Outcome: Interview Offer (24.01.11) → Interview (24.02.09-22) → Accepted (24.02.17)
⑽ UIUC Computer Science (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: Thought CS fits well, offers a favorable career path. Interested in theoretical aspects like cryptography, information theory, and game theory.
○ Expected Acceptance Rate 10%: Good fit with CS, but competition is high.
○ Actual Outcome: Rejected (24.03.16)
⑾ Georgia Tech Bioinformatics in College of Computing (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: Interested in someone there.
○ Expected Acceptance Rate 30%: Was surprised to find a good fit with CS. Extensive relevant experience might be advantageous.
○ Actual Outcome: Rejected (24.04.29)
⑿ Georgia Tech Bioinformatics in Biomedical Engineering (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Interested in someone there.
○ Expected Acceptance Rate 45%: Seems less challenging than GT CS. In addition, there are faculty working on spatial bioinformatics.
○ Actual Outcome: Interview Offer (24.01.20) → Interview (24.01.05, 03.08) → Accepted (24.03.13)
⒀ WashU Physics (■ physics; Spatial Bioinformatics)
○ Motivation for Application: Discovered the department after successful contact with Professor Mikhail Tikhonov. Initially considered physics but realized the fit wasn’t right for my career in the bio industry (mostly pharmaceuticals), so opted for an opportunity to learn machine learning properly.
○ Expected Acceptance Rate 85%: WashU Physics categorized as an unpopular department, hence lower competition. Faculty research fields vary, and successful pre-contact made a positive impression. Factors like a good GRE Physics score are promising.
○ Actual Outcome: Rejected (24.04.15)
⒁ UW Computer Science (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: CS is a good fit and offers a favorable career path. The department’s strong emphasis on digital pathology aligns with the author’s research background in spatial bioinformatics.
○ Expected Acceptance Rate 10%: The fit with CS is good, but competition is high.
○ Actual Outcome: Interview Offer (24.01.22) → Interview (24.01.23) → Rejected (24.02.08)
⒂ UCSF Biological and Medical Informatics (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Discovered the department while researching bioinformatics. Although it rarely accepts international students, applied because many of its professors work on spatial bioinformatics.
○ Expected Acceptance Rate 0%: Only 2 of the 55 doctoral students are international.
○ Actual Outcome: Rejected (23.12.22)
⒃ Tri-I Computational Biology and Medicine (■ CS/BI; Spatial Bioinformatics)
○ Motivation for Application: Discovered the department while researching bioinformatics. Several alumni are there, and it is also international-friendly.
○ Expected Acceptance Rate 35%: With many professors and achievements in spatial bioinformatics, the chance of acceptance should be high. However, being an Ivy League institution, it won’t be easy.
○ Actual Outcome: Rejected (24.03.08)
⒄ Gerstner Sloan Kettering Graduate School Cancer Engineering (■ imaging; Spatiotemporal Omics)
○ Motivation for Application: Applied after two pre-contact sessions, held in person and over Zoom.
○ Expected Acceptance Rate 35%: Although the field isn’t a perfect match, a strong GRE Physics score and an understanding of MRI should increase the likelihood of acceptance.
○ Actual Outcome: Interview Offer (23.12.19) → Interview (24.01.16-18) → Accepted (24.02.02)
⒅ TSRI Computational Biology and Modeling (■ biochem; Spatiotemporal Omics)
○ Motivation for Application: Unresolved lingering feelings about the program. Even so, the author would seriously consider enrolling here.
○ Expected Acceptance Rate 60%: Given last year’s near success, there is a good chance of acceptance.
○ Actual Outcome: Rejected (24.02.02)
⒆ 2024-2025 US Grad Admission Analysis
○ Comparison of expected pass rates and actual results.
import matplotlib.pyplot as plt
# Data mapping expected acceptance rates to actual outcomes (0: rejection, 0.5: interview only, 1: accepted)
data = {
"Stanford Bioengineering": (10, 0),
"Harvard Biophysics": (10, 0),
"Michigan Bioinformatics (PIBS)": (0, 1),
"JHU Biomedical Engineering": (5, 0),
"UCLA Biochemistry": (60, 1),
"UCSD Bioinformatics and Systems Biology": (30, 0.5),
"NYU Vilcek Institute": (45, 0),
"Wisconsin-Madison Biomedical Data Science": (30, 0.5),
"Wisconsin-Madison Medical Physics": (7, 1),
"UIUC Computer Science": (10, 0),
"Georgia Tech Bioinformatics (College of Computing)": (30, 1),
"Georgia Tech Bioinformatics (Biomedical Engineering)": (45, 1),
"WashU Physics": (85, 0),
"UW Computer Science": (10, 0.5),
"UCSF Biological and Medical Informatics": (0, 0),
"Tri-I Computational Biology and Medicine": (35, 0),
"Gerstner Sloan Kettering Cancer Engineering": (35, 1),
"TSRI Computational Biology and Modeling": (60, 0)
}
# Split the (expected rate, actual outcome) pairs into two sequences for plotting
expected_rates, actual_results = zip(*data.values())
# Scatter plot of expected rate against actual outcome
plt.figure(figsize=(10, 6))
plt.scatter(expected_rates, actual_results, color='blue')
plt.title('Scatter Plot of Expected Acceptance Rates vs. Actual Results')
plt.xlabel('Expected Acceptance Rate (%)')
plt.ylabel('Actual Result')
plt.yticks([0, 0.5, 1], ['Rejected', 'Interview Only', 'Accepted'])
plt.grid(True)
plt.show()
○ Scoring rejections as 0, interview offers as 0.5, and final offers as 1.
○ Regression line: y = (7.639e-05) x + (4.145e-01)
○ Spearman’s rho = 0.02050878 → almost negligible
○ R-squared = 0.001580409 % (share of variance explained) → almost negligible
○ Conclusion: The actual results of the study abroad application process were completely different from the predicted results.
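The regression line and R-squared listed above can be reproduced from the (expected rate, outcome) pairs with ordinary least squares. The following is a minimal pure-Python sketch, assuming a simple linear fit of the outcome score on the expected rate; the helper name `fit_line` is hypothetical, and the Spearman coefficient is not recomputed here.

```python
# Expected acceptance rate (%) and outcome score (0, 0.5, 1) for the 18 programs,
# listed in the same order as the `data` dictionary.
expected = [10, 10, 0, 5, 60, 30, 45, 30, 7, 10, 30, 45, 85, 10, 0, 35, 35, 60]
outcome = [0, 0, 1, 0, 1, 0.5, 0, 0.5, 1, 0, 1, 1, 0, 0.5, 0, 0, 1, 0]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(expected, outcome)
print(f"y = {slope:.3e} x + {intercept:.3e}")
print(f"R-squared = {r_squared * 100:.6f} %")
```

Because the slope is close to zero, the fitted line is nearly flat at the mean outcome score, which is another way of saying the expected rates carried almost no information about the results.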
○ Admission Rates by QS Ranking 2024
import matplotlib.pyplot as plt
import pandas as pd
# Data mapping expected acceptance rates to actual outcomes (0: rejection, 0.5: interview only, 1: accepted)
data = {
"Stanford Bioengineering": (10, 0),
"Harvard Biophysics": (10, 0),
"Michigan Bioinformatics (PIBS)": (0, 1),
"JHU Biomedical Engineering": (5, 0),
"UCLA Biochemistry": (60, 1),
"UCSD Bioinformatics and Systems Biology": (30, 0.5),
"NYU Vilcek Institute": (45, 0),
"Wisconsin-Madison Biomedical Data Science": (30, 0.5),
"Wisconsin-Madison Medical Physics": (7, 1),
"UIUC Computer Science": (10, 0),
"Georgia Tech Bioinformatics (College of Computing)": (30, 1),
"Georgia Tech Bioinformatics (Biomedical Engineering)": (45, 1),
"WashU Physics": (85, 0),
"UW Computer Science": (10, 0.5),
"UCSF Biological and Medical Informatics": (0, 0),
"Tri-I Computational Biology and Medicine": (35, 0),
"Gerstner Sloan Kettering Cancer Engineering": (35, 1),
"TSRI Computational Biology and Modeling": (60, 0)
}
# QS rankings for each institution
rankings = {
"Stanford Bioengineering": 5,
"Harvard Biophysics": 4,
"Michigan Bioinformatics (PIBS)": 33,
"JHU Biomedical Engineering": 28,
"UCLA Biochemistry": 29,
"UCSD Bioinformatics and Systems Biology": 62,
"NYU Vilcek Institute": 38,
"Wisconsin-Madison Biomedical Data Science": 102,
"Wisconsin-Madison Medical Physics": 102, # Same as above, duplicated intentionally
"UIUC Computer Science": 64,
"Georgia Tech Bioinformatics (College of Computing)": 97,
"Georgia Tech Bioinformatics (Biomedical Engineering)": 97, # Same as above, duplicated intentionally
"WashU Physics": 154,
"UW Computer Science": 63,
"UCSF Biological and Medical Informatics": None,
"Tri-I Computational Biology and Medicine": None,
"Gerstner Sloan Kettering Cancer Engineering": None,
"TSRI Computational Biology and Modeling": None
}
# Filtering out programs without a QS ranking
filtered_data = {key: (value[1], rankings[key]) for key, value in data.items() if rankings[key] is not None}
# Create a DataFrame from the filtered data
df_data = pd.DataFrame.from_dict(filtered_data, orient='index', columns=['Actual Result', 'QS Ranking'])
df_data.reset_index(inplace=True, drop=True)
# Group data by actual result for box plot
grouped_data = [df_data[df_data['Actual Result'] == outcome]['QS Ranking'].dropna() for outcome in [0, 0.5, 1]]
# Creating the box plot
plt.figure(figsize=(10, 6))
plt.boxplot(grouped_data, labels=['Rejected', 'Interview Only', 'Accepted'])
plt.title('QS Rankings by Admission Outcome')
plt.xlabel('Admission Outcome')
plt.ylabel('QS Ranking (2024)')
plt.gca().invert_yaxis() # Higher rankings are numerically lower
plt.grid(True)
plt.show()
○ Conclusion: Comparing only the median values, acceptances tend to come from lower-ranked institutions (numerically larger QS ranks).
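The medians behind this box plot can also be checked directly. The sketch below is a small illustration using only the standard library; it copies the outcome scores and QS ranks of the 14 ranked programs from the dictionaries above, and the variable names are hypothetical.

```python
from statistics import median

# (outcome score, QS 2024 rank) for the programs that have a QS rank.
qs_by_program = {
    "Stanford Bioengineering": (0, 5),
    "Harvard Biophysics": (0, 4),
    "Michigan Bioinformatics (PIBS)": (1, 33),
    "JHU Biomedical Engineering": (0, 28),
    "UCLA Biochemistry": (1, 29),
    "UCSD Bioinformatics and Systems Biology": (0.5, 62),
    "NYU Vilcek Institute": (0, 38),
    "Wisconsin-Madison Biomedical Data Science": (0.5, 102),
    "Wisconsin-Madison Medical Physics": (1, 102),
    "UIUC Computer Science": (0, 64),
    "Georgia Tech Bioinformatics (College of Computing)": (1, 97),
    "Georgia Tech Bioinformatics (Biomedical Engineering)": (1, 97),
    "WashU Physics": (0, 154),
    "UW Computer Science": (0.5, 63),
}

# Collect the ranks under each outcome score, then take the median per group.
ranks_by_outcome = {}
for outcome, rank in qs_by_program.values():
    ranks_by_outcome.setdefault(outcome, []).append(rank)

median_rank = {outcome: median(ranks) for outcome, ranks in ranks_by_outcome.items()}

labels = {0: "Rejected", 0.5: "Interview Only", 1: "Accepted"}
for outcome in (0, 0.5, 1):
    print(f"{labels[outcome]}: median QS rank {median_rank[outcome]}")
```

With only 14 ranked programs, and two institutions counted twice, these medians describe this one application cycle rather than a general pattern.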
○ Admission Rates by US News Ranking 2024
import matplotlib.pyplot as plt
import pandas as pd
# Data mapping expected acceptance rates to actual outcomes (0: rejection, 0.5: interview only, 1: accepted)
data = {
"Stanford Bioengineering": (10, 0),
"Harvard Biophysics": (10, 0),
"Michigan Bioinformatics (PIBS)": (0, 1),
"JHU Biomedical Engineering": (5, 0),
"UCLA Biochemistry": (60, 1),
"UCSD Bioinformatics and Systems Biology": (30, 0.5),
"NYU Vilcek Institute": (45, 0),
"Wisconsin-Madison Biomedical Data Science": (30, 0.5),
"Wisconsin-Madison Medical Physics": (7, 1),
"UIUC Computer Science": (10, 0),
"Georgia Tech Bioinformatics (College of Computing)": (30, 1),
"Georgia Tech Bioinformatics (Biomedical Engineering)": (45, 1),
"WashU Physics": (85, 0),
"UW Computer Science": (10, 0.5),
"UCSF Biological and Medical Informatics": (0, 0),
"Tri-I Computational Biology and Medicine": (35, 0),
"Gerstner Sloan Kettering Cancer Engineering": (35, 1),
"TSRI Computational Biology and Modeling": (60, 0)
}
# US News rankings for each institution
rankings = {
"Stanford Bioengineering": 3,
"Harvard Biophysics": 3,
"Michigan Bioinformatics (PIBS)": 21,
"JHU Biomedical Engineering": 9,
"UCLA Biochemistry": 15,
"UCSD Bioinformatics and Systems Biology": 28,
"NYU Vilcek Institute": 35,
"Wisconsin-Madison Biomedical Data Science": 35,
"Wisconsin-Madison Medical Physics": 35, # Same as above, duplicated intentionally
"UIUC Computer Science": 35,
"Georgia Tech Bioinformatics (College of Computing)": 33,
"Georgia Tech Bioinformatics (Biomedical Engineering)": 33, # Same as above, duplicated intentionally
"WashU Physics": 24,
"UW Computer Science": 40,
"UCSF Biological and Medical Informatics": 115,
"Tri-I Computational Biology and Medicine": None,
"Gerstner Sloan Kettering Cancer Engineering": None,
"TSRI Computational Biology and Modeling": None
}
# Filtering out programs without a US News ranking
filtered_data = {key: (value[1], rankings[key]) for key, value in data.items() if rankings[key] is not None}
# Create a DataFrame from the filtered data
df_data = pd.DataFrame.from_dict(filtered_data, orient='index', columns=['Actual Result', 'US News Ranking'])
df_data.reset_index(inplace=True, drop=True)
# Group data by actual result for box plot
grouped_data = [df_data[df_data['Actual Result'] == outcome]['US News Ranking'].dropna() for outcome in [0, 0.5, 1]]
# Creating the box plot
plt.figure(figsize=(10, 6))
plt.boxplot(grouped_data, labels=['Rejected', 'Interview Only', 'Accepted'])
plt.title('US News Rankings by Admission Outcome')
plt.xlabel('Admission Outcome')
plt.ylabel('US News Ranking (2024)')
plt.gca().invert_yaxis() # Higher rankings are numerically lower
plt.grid(True)
plt.show()
○ Acceptance rate trends by research direction (spatiotemporal omics vs spatial bioinformatics)
import matplotlib.pyplot as plt
import pandas as pd
# Data mapping expected acceptance rates to actual outcomes (0: rejection, 0.5: interview only, 1: accepted)
data = {
"Stanford Bioengineering": (10, 0, "Spatiotemporal Omics"),
"Harvard Biophysics": (10, 0, "Spatiotemporal Omics"),
"Michigan Bioinformatics (PIBS)": (0, 1, "Spatial Bioinformatics"),
"JHU Biomedical Engineering": (5, 0, "Spatial Bioinformatics"),
"UCLA Biochemistry": (60, 1, "Spatiotemporal Omics"),
"UCSD Bioinformatics and Systems Biology": (30, 0.5, "Spatiotemporal Omics"),
"NYU Vilcek Institute": (45, 0, "Spatiotemporal Omics"),
"Wisconsin-Madison Biomedical Data Science": (30, 0.5, "Spatial Bioinformatics"),
"Wisconsin-Madison Medical Physics": (7, 1, "Spatiotemporal Omics"),
"UIUC Computer Science": (10, 0, "Spatial Bioinformatics"),
"Georgia Tech Bioinformatics (College of Computing)": (30, 1, "Spatial Bioinformatics"),
"Georgia Tech Bioinformatics (Biomedical Engineering)": (45, 1, "Spatiotemporal Omics"),
"WashU Physics": (85, 0, "Spatial Bioinformatics"),
"UW Computer Science": (10, 0.5, "Spatial Bioinformatics"),
"UCSF Biological and Medical Informatics": (0, 0, "Spatiotemporal Omics"),
"Tri-I Computational Biology and Medicine": (35, 0, "Spatial Bioinformatics"),
"Gerstner Sloan Kettering Cancer Engineering": (35, 1, "Spatiotemporal Omics"),
"TSRI Computational Biology and Modeling": (60, 0, "Spatiotemporal Omics")
}
# Create a DataFrame from the data
df_data = pd.DataFrame.from_dict(data, orient='index', columns=['Expected Rate', 'Actual Result', 'Field'])
df_data.reset_index(inplace=True, drop=True)
# Group data by field for box plot
grouped_omics = df_data[df_data['Field'] == "Spatiotemporal Omics"]['Actual Result'].dropna()
grouped_bioinfo = df_data[df_data['Field'] == "Spatial Bioinformatics"]['Actual Result'].dropna()
# Creating the box plot
plt.figure(figsize=(10, 6))
plt.boxplot([grouped_omics, grouped_bioinfo], labels=['Spatiotemporal Omics', 'Spatial Bioinformatics'])
plt.title('Admission Outcomes by Field of Study')
plt.xlabel('Field of Study')
plt.ylabel('Admission Outcome')
plt.yticks([0, 0.5, 1], ['Rejected', 'Interview Only', 'Accepted'])
plt.grid(True)
plt.show()
○ Conclusion: Judging by the medians, my outcomes in Spatial Bioinformatics were at least as good as those in Spatiotemporal Omics, which I attribute to my relatively limited experimental experience.
○ Trend in acceptance rates by field of study (e.g., biochemistry)
import matplotlib.pyplot as plt
import pandas as pd
# Data mapping expected acceptance rates to actual outcomes (0: rejection, 0.5: interview only, 1: accepted) with new categories
data = {
"Stanford Bioengineering": (10, 0, "biochem"),
"Harvard Biophysics": (10, 0, "imaging"),
"Michigan Bioinformatics (PIBS)": (0, 1, "CS/BI"),
"JHU Biomedical Engineering": (5, 0, "CS/BI"),
"UCLA Biochemistry": (60, 1, "biochem"),
"UCSD Bioinformatics and Systems Biology": (30, 0.5, "imaging"),
"NYU Vilcek Institute": (45, 0, "imaging"),
"Wisconsin-Madison Biomedical Data Science": (30, 0.5, "CS/BI"),
"Wisconsin-Madison Medical Physics": (7, 1, "imaging"),
"UIUC Computer Science": (10, 0, "CS/BI"),
"Georgia Tech Bioinformatics (College of Computing)": (30, 1, "CS/BI"),
"Georgia Tech Bioinformatics (Biomedical Engineering)": (45, 1, "imaging"),
"WashU Physics": (85, 0, "physics"),
"UW Computer Science": (10, 0.5, "CS/BI"),
"UCSF Biological and Medical Informatics": (0, 0, "imaging"),
"Tri-I Computational Biology and Medicine": (35, 0, "CS/BI"),
"Gerstner Sloan Kettering Cancer Engineering": (35, 1, "imaging"),
"TSRI Computational Biology and Modeling": (60, 0, "biochem")
}
# Create a DataFrame from the data
df_data = pd.DataFrame.from_dict(data, orient='index', columns=['Expected Rate', 'Actual Result', 'Category'])
df_data.reset_index(inplace=True, drop=True)
# Group data by category for box plot
grouped_biochem = df_data[df_data['Category'] == "biochem"]['Actual Result'].dropna()
grouped_imaging = df_data[df_data['Category'] == "imaging"]['Actual Result'].dropna()
grouped_cs_bi = df_data[df_data['Category'] == "CS/BI"]['Actual Result'].dropna()
grouped_physics = df_data[df_data['Category'] == "physics"]['Actual Result'].dropna()
# Creating the box plot
plt.figure(figsize=(10, 6))
plt.boxplot([grouped_biochem, grouped_imaging, grouped_cs_bi, grouped_physics],
labels=['Biochem', 'Imaging', 'CS/BI', 'Physics'])
plt.title('Admission Outcomes by Research Category')
plt.xlabel('Research Category')
plt.ylabel('Admission Outcome')
plt.yticks([0, 0.5, 1], ['Rejected', 'Interview Only', 'Accepted'])
plt.grid(True)
plt.show()
○ Conclusion: The author is most competitive in the Imaging and CS/BI categories.
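The per-category comparison can be summarized numerically as well. The sketch below groups the outcome scores by category and reports the count, median, and mean for each group; it uses only the standard library, and the names `results`, `scores`, and `summary` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

# (category, outcome score) pairs copied from the `data` dictionary above.
results = [
    ("biochem", 0), ("imaging", 0), ("CS/BI", 1), ("CS/BI", 0), ("biochem", 1),
    ("imaging", 0.5), ("imaging", 0), ("CS/BI", 0.5), ("imaging", 1), ("CS/BI", 0),
    ("CS/BI", 1), ("imaging", 1), ("physics", 0), ("CS/BI", 0.5), ("imaging", 0),
    ("CS/BI", 0), ("imaging", 1), ("biochem", 0),
]

# Group the outcome scores by category.
scores = defaultdict(list)
for category, outcome in results:
    scores[category].append(outcome)

# Count, median, and mean outcome score per category.
summary = {
    category: {"n": len(vals), "median": median(vals), "mean": mean(vals)}
    for category, vals in scores.items()
}

for category, stats in summary.items():
    print(f"{category:8s} n={stats['n']}  median={stats['median']}  mean={stats['mean']:.3f}")
```

The group sizes are small (physics has a single program), so the summary is best read as a description of this application cycle rather than a statistical test.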
○ Depicting the trend of rejections (labeled 'Declined' in the code) over time using a Kaplan-Meier survival curve.
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Actual admission data
admission_data = {
'Program': [
'Stanford Bioengineering', 'Harvard Biophysics', 'Michigan Bioinformatics (PIBS)',
'JHU Biomedical Engineering', 'UCLA Biochemistry', 'UCSD Bioinformatics and Systems Biology',
'NYU Vilcek Institute', 'Wisconsin-Madison Biomedical Data Science', 'Wisconsin-Madison Medical Physics',
'UIUC Computer Science', 'Georgia Tech Bioinformatics (College of Computing)',
'Georgia Tech Bioinformatics (Biomedical Engineering)', 'WashU Physics', 'UW Computer Science',
'UCSF Biological and Medical Informatics', 'Tri-I Computational Biology and Medicine',
'Gerstner Sloan Kettering Cancer Engineering', 'TSRI Computational Biology and Modeling'
],
'Decision_Date': [
'2024.02.15', '2024.03.07', '2024.02.09', '2024.01.20', '2024.01.24', '2024.02.28',
'2024.01.13', '2024.02.22', '2024.02.17', '2024.03.16', '2024.03.13', '2024.03.13',
'2024.04.15', '2024.02.08', '2023.12.22', '2024.03.08', '2024.02.02', '2024.02.02'
],
'Status': [
'Declined', 'Declined', 'Accepted', 'Declined', 'Accepted', 'Declined',
'Declined', 'Declined', 'Accepted', 'Declined', 'Accepted', 'Accepted',
'Declined', 'Declined', 'Declined', 'Declined', 'Accepted', 'Declined'
]
}
for i in range(18):
    if admission_data['Status'][i] == 'Accepted':
        admission_data['Decision_Date'][i] = '2024.05.01' # An arbitrary large value
# Convert to DataFrame
df_admissions = pd.DataFrame(admission_data)
# Map the 'Status' to a numerical value, where 1 indicates an event (decline) occurred
df_admissions['Event_Occurred'] = df_admissions['Status'].apply(lambda x: 1 if x == 'Declined' else 0)
# Assume all applications were submitted on December 1, 2023 (this is the 'start' of our study)
application_start_date = pd.to_datetime('2023-12-01')
df_admissions['Decision_Date'] = pd.to_datetime(df_admissions['Decision_Date'])
df_admissions['Days'] = (df_admissions['Decision_Date'] - application_start_date).dt.days
# Fit the Kaplan-Meier survival estimator on the data
kmf = KaplanMeierFitter()
kmf.fit(df_admissions['Days'], event_observed=df_admissions['Event_Occurred'])
# Define a time range from the start date to April 15, 2024 (the last rejection) for the plot
end_date = pd.to_datetime('2024-04-15')
time_range = (end_date - application_start_date).days
timeline = range(0, time_range + 1)
# Plot the survival function over the defined timeline
kmf.plot_survival_function()
# Convert numerical days back to dates for plotting on the x-axis
xticks_days = np.linspace(0, time_range, num=5)
xticks_dates = [application_start_date + pd.Timedelta(days=day) for day in xticks_days]
plt.xticks(xticks_days, [date.strftime('%b') for date in xticks_dates])
# Display the plot with labels
plt.title('Kaplan-Meier Survival Curve for Admission Decisions')
plt.xlabel('Month of Decision')
plt.ylabel('Survival Probability (No Decline)')
plt.grid(True)
plt.show()
○ Note that the x-axis labels are slightly off: the ticks are labeled by month name only, so April 15, for example, appears as if it were April 1.
○ It is important to note that top schools like Harvard and Stanford are known for releasing their rejection results quite late.
○ The shape of the function closely resembles a 4PL regression curve.
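To make clear what the survival curve represents, the product-limit (Kaplan-Meier) estimate can be computed by hand. The sketch below is a minimal pure-Python version under the same assumptions as the code above: all applications start on December 1, 2023, rejections are events, and accepted programs are censored on May 1, 2024. The function name `kaplan_meier` is hypothetical and is not part of lifelines.

```python
from datetime import date

START = date(2023, 12, 1)   # assumed common submission date
CENSOR = date(2024, 5, 1)   # accepted programs are censored at this date

# Rejection dates of the 12 rejected programs; the 6 acceptances are censored.
rejection_dates = [
    date(2024, 2, 15), date(2024, 3, 7), date(2024, 1, 20), date(2024, 2, 28),
    date(2024, 1, 13), date(2024, 2, 22), date(2024, 3, 16), date(2024, 4, 15),
    date(2024, 2, 8), date(2023, 12, 22), date(2024, 3, 8), date(2024, 2, 2),
]
n_accepted = 6

# (duration in days, event flag): 1 = rejected, 0 = censored
records = [((d - START).days, 1) for d in rejection_dates]
records += [((CENSOR - START).days, 0)] * n_accepted


def kaplan_meier(records):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    survival = 1.0
    curve = []
    for t in sorted({time for time, event in records if event == 1}):
        at_risk = sum(1 for time, _ in records if time >= t)
        events = sum(1 for time, event in records if time == t and event == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


curve = kaplan_meier(records)
for t, s in curve:
    print(f"day {t:3d}: S = {s:.3f}")
```

Since every acceptance is censored after the last rejection, the estimate at each step is simply the share of applications not yet rejected, ending at 6 of 18.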
○ Trends of rejection over time by research direction (spatiotemporal omics vs spatial bioinformatics).
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Actual admission data
admission_data = {
'Program': [
'Stanford Bioengineering', 'Harvard Biophysics', 'Michigan Bioinformatics (PIBS)',
'JHU Biomedical Engineering', 'UCLA Biochemistry', 'UCSD Bioinformatics and Systems Biology',
'NYU Vilcek Institute', 'Wisconsin-Madison Biomedical Data Science', 'Wisconsin-Madison Medical Physics',
'UIUC Computer Science', 'Georgia Tech Bioinformatics (College of Computing)',
'Georgia Tech Bioinformatics (Biomedical Engineering)', 'WashU Physics', 'UW Computer Science',
'UCSF Biological and Medical Informatics', 'Tri-I Computational Biology and Medicine',
'Gerstner Sloan Kettering Cancer Engineering', 'TSRI Computational Biology and Modeling'
],
'Decision_Date': [
'2024.02.15', '2024.03.07', '2024.02.09', '2024.01.20', '2024.01.24', '2024.02.28',
'2024.01.13', '2024.02.22', '2024.02.17', '2024.03.16', '2024.03.13', '2024.03.13',
'2024.04.15', '2024.02.08', '2023.12.22', '2024.03.08', '2024.02.02', '2024.02.02'
],
'Status': [
'Declined', 'Declined', 'Accepted', 'Declined', 'Accepted', 'Declined',
'Declined', 'Declined', 'Accepted', 'Declined', 'Accepted', 'Accepted',
'Declined', 'Declined', 'Declined', 'Declined', 'Accepted', 'Declined'
],
'Field': [
"Spatiotemporal Omics", "Spatiotemporal Omics", "Spatial Bioinformatics",
"Spatial Bioinformatics", "Spatiotemporal Omics", "Spatiotemporal Omics",
"Spatiotemporal Omics", "Spatial Bioinformatics", "Spatiotemporal Omics",
"Spatial Bioinformatics", "Spatial Bioinformatics", "Spatiotemporal Omics",
"Spatial Bioinformatics", "Spatial Bioinformatics", "Spatiotemporal Omics",
"Spatial Bioinformatics", "Spatiotemporal Omics", "Spatiotemporal Omics"
]
}
for i in range(18):
    if admission_data['Status'][i] == 'Accepted':
        admission_data['Decision_Date'][i] = '2024.05.01' # An arbitrary large value
# Create a DataFrame from the data
df_admissions = pd.DataFrame(admission_data)
# Map the 'Status' to a binary value, where 1 indicates an event (decline) occurred
df_admissions['Event_Occurred'] = df_admissions['Status'].apply(lambda x: 1 if x == 'Declined' else 0)
# Assume all applications were submitted on December 1, 2023 (this is the 'start' of our study)
application_start_date = pd.to_datetime('2023-12-01')
df_admissions['Decision_Date'] = pd.to_datetime(df_admissions['Decision_Date'])
df_admissions['Days'] = (df_admissions['Decision_Date'] - application_start_date).dt.days
# Fit the Kaplan-Meier survival estimator for each 'Field'
kmf = KaplanMeierFitter()
# Plot setup
fig, ax = plt.subplots(figsize=(10, 6))
# Colors for the plot
colors = ['blue', 'green']
# Perform the Kaplan-Meier fit and plot for each field
for i, field in enumerate(df_admissions['Field'].unique()):
    # Only two colors are defined, so skip any label beyond the first two
    if i > 1:
        continue
    # Filter the data for the field
    field_data = df_admissions[df_admissions['Field'] == field]
    # Fit the Kaplan-Meier model
    kmf.fit(field_data['Days'], event_observed=field_data['Event_Occurred'], label=field)
    # Define a time range from the start date to April 15, 2024 for the plot
    end_date = pd.to_datetime('2024-04-15')
    time_range = (end_date - application_start_date).days
    timeline = range(0, time_range + 1)
    # Plot the survival curve
    kmf.plot_survival_function(ax=ax, color=colors[i])
# Convert numerical days back to dates for plotting on the x-axis
xticks_days = np.linspace(0, time_range, num=5)
xticks_dates = [application_start_date + pd.Timedelta(days=day) for day in xticks_days]
plt.xticks(xticks_days, [date.strftime('%b') for date in xticks_dates])
plt.title('Kaplan-Meier Survival Analysis by Research Direction')
plt.xlabel('Days Since Application Start Date')
plt.ylabel('Survival Probability')
plt.legend(title='Research Direction')
plt.grid(True)
plt.show()
○ Conclusion: Spatiotemporal omics, the more experiment-oriented direction, shows rejections arriving faster.
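A simpler check of the same trend is the median number of days until rejection in each research direction. The sketch below is only a rough summary under the same assumed start date of December 1, 2023: unlike the Kaplan-Meier curves, it ignores the censored (accepted) programs, and the variable names are hypothetical.

```python
from datetime import date
from statistics import median

START = date(2023, 12, 1)  # assumed common submission date

# (research direction, rejection date) for the 12 rejected programs.
rejections = [
    ("Spatiotemporal Omics", date(2024, 2, 15)),    # Stanford
    ("Spatiotemporal Omics", date(2024, 3, 7)),     # Harvard
    ("Spatial Bioinformatics", date(2024, 1, 20)),  # JHU
    ("Spatiotemporal Omics", date(2024, 2, 28)),    # UCSD
    ("Spatiotemporal Omics", date(2024, 1, 13)),    # NYU
    ("Spatial Bioinformatics", date(2024, 2, 22)),  # Wisconsin-Madison BDS
    ("Spatial Bioinformatics", date(2024, 3, 16)),  # UIUC
    ("Spatial Bioinformatics", date(2024, 4, 15)),  # WashU
    ("Spatial Bioinformatics", date(2024, 2, 8)),   # UW
    ("Spatiotemporal Omics", date(2023, 12, 22)),   # UCSF
    ("Spatial Bioinformatics", date(2024, 3, 8)),   # Tri-I
    ("Spatiotemporal Omics", date(2024, 2, 2)),     # TSRI
]

# Days from the start date to each rejection, grouped by direction.
days_by_field = {}
for field, decided in rejections:
    days_by_field.setdefault(field, []).append((decided - START).days)

median_days = {field: median(days) for field, days in days_by_field.items()}
for field, days in median_days.items():
    print(f"{field}: median {days} days to rejection")
```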
○ Trends of rejection over time by research field (e.g., biochemistry).
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Actual admission data
admission_data = {
'Program': [
'Stanford Bioengineering', 'Harvard Biophysics', 'Michigan Bioinformatics (PIBS)',
'JHU Biomedical Engineering', 'UCLA Biochemistry', 'UCSD Bioinformatics and Systems Biology',
'NYU Vilcek Institute', 'Wisconsin-Madison Biomedical Data Science', 'Wisconsin-Madison Medical Physics',
'UIUC Computer Science', 'Georgia Tech Bioinformatics (College of Computing)',
'Georgia Tech Bioinformatics (Biomedical Engineering)', 'WashU Physics', 'UW Computer Science',
'UCSF Biological and Medical Informatics', 'Tri-I Computational Biology and Medicine',
'Gerstner Sloan Kettering Cancer Engineering', 'TSRI Computational Biology and Modeling'
],
'Decision_Date': [
'2024.02.15', '2024.03.07', '2024.02.09', '2024.01.20', '2024.01.24', '2024.02.28',
'2024.01.13', '2024.02.22', '2024.02.17', '2024.03.16', '2024.03.13', '2024.03.13',
'2024.04.15', '2024.02.08', '2023.12.22', '2024.03.08', '2024.02.02', '2024.02.02'
],
'Status': [
'Declined', 'Declined', 'Accepted', 'Declined', 'Accepted', 'Declined',
'Declined', 'Declined', 'Accepted', 'Declined', 'Accepted', 'Accepted',
'Declined', 'Declined', 'Declined', 'Declined', 'Accepted', 'Declined'
],
'Category': [
"biochem", "imaging", "CS/BI",
"CS/BI", "biochem", "imaging",
"imaging", "CS/BI", "imaging",
"CS/BI", "CS/BI", "imaging",
"physics", "CS/BI", "imaging",
"CS/BI", "imaging", "biochem"
]
}
for i in range(18):
    if admission_data['Status'][i] == 'Accepted':
        admission_data['Decision_Date'][i] = '2024.05.01' # An arbitrary large value
# Create a DataFrame from the data
df_admissions = pd.DataFrame(admission_data)
# Map the 'Status' to a binary value, where 1 indicates an event (decline) occurred
df_admissions['Event_Occurred'] = df_admissions['Status'].apply(lambda x: 1 if x == 'Declined' else 0)
# Assume all applications were submitted on December 1, 2023 (this is the 'start' of our study)
application_start_date = pd.to_datetime('2023-12-01')
df_admissions['Decision_Date'] = pd.to_datetime(df_admissions['Decision_Date'])
df_admissions['Days'] = (df_admissions['Decision_Date'] - application_start_date).dt.days
# Fit the Kaplan-Meier survival estimator for each 'Category'
kmf = KaplanMeierFitter()
# Plot setup
fig, ax = plt.subplots(figsize=(10, 6))
# Colors for the plot
colors = ['red', 'orange', 'green', 'blue']
# Perform the Kaplan-Meier fit and plot for each category
for i, category in enumerate(df_admissions['Category'].unique()):
    # Only four colors are defined, so skip any label beyond the first four
    if i > 3:
        continue
    # Filter the data for the category
    category_data = df_admissions[df_admissions['Category'] == category]
    # Fit the Kaplan-Meier model
    kmf.fit(category_data['Days'], event_observed=category_data['Event_Occurred'], label=category)
    # Define a time range from the start date to April 15, 2024 for the plot
    end_date = pd.to_datetime('2024-04-15')
    time_range = (end_date - application_start_date).days
    timeline = range(0, time_range + 1)
    # Plot the survival curve
    kmf.plot_survival_function(ax=ax, color=colors[i])
# Convert numerical days back to dates for plotting on the x-axis
xticks_days = np.linspace(0, time_range, num=5)
xticks_dates = [application_start_date + pd.Timedelta(days=day) for day in xticks_days]
plt.xticks(xticks_days, [date.strftime('%b') for date in xticks_dates])
plt.title('Kaplan-Meier Survival Analysis by Research Category')
plt.xlabel('Days Since Application Start Date')
plt.ylabel('Survival Probability')
plt.legend(title='Research Category')
plt.grid(True)
plt.show()
○ Conclusion: Applications in the imaging field tend to be rejected more rapidly.
2. Study Abroad Acceptance Essay
○ Ultimately, the statement of purpose (SOP) should address a big question.
○ Just as a bold entrepreneur captures people’s imagination, the SOP should present a scientific vision of the future that moves the professors’ hearts.
○ By contrast, emphasizing proficiency in decades-old techniques such as Western blotting, however excellent, is counterproductive.
○ It is therefore more important to convey the determination to realize this vision than to present oneself as the stereotypical obedient Asian student.
○ Nevertheless, being too bold may backfire, so maintaining an appropriate stance is crucial.
○ The safest option is to select a future trend that resonates with many professors, referencing historical and current technological advancements.
○ In the author’s case, this meant envisioning a future where spatiotemporal omics is a standard technology across red, white, and green biotechnology.
○ Impressions of Each Program
○ UCLA BMSB: Feels like a field that requires a lot of soft knowledge even within dry-lab work. Biochem and chemoinformatics seem strong, possibly because global pharmaceutical companies are present in California, where the UCs are located. However, the lack of visible stipend information may be related to tight funding across the UCs and the student protests at the UCs during the 2023-2024 admissions cycle.
○ UMich Bioinformatics: Graduation typically takes 3 to 5 years, but for master’s degree holders it seems to be around 3 to 4 years. Some PIs aim to graduate students in just 3 years. Fields like information theory require a lot of hard knowledge even within dry-lab work. However, there are efforts to increase the survivability of BI departments by establishing wet-labs for independent data production. Funding is full tuition + ~40k USD/yr stipend. Given how few international students the PhD program admits, the acceptance was surprising.
○ GT Bioinformatics: Judging from graduates’ career paths, GT seems to have better FAANG employment prospects than UMich (for Bioinformatics programs as well as CS). Although there were rumors of lower stipends, the author received full tuition funding + ~40k USD/yr stipend. The author applied for bioinformatics through both CS and BME; because the degree is ultimately awarded in bioinformatics either way, this counted as a duplicate application. BME seems to be jointly operated with nearby Emory University. There are many international students.
○ Sloan Kettering Institute Cancer Engineering: Visiting Sloan Kettering, which is connected with Rockefeller University and Weill Cornell, the author saw the forefront of bioengineering research. One example was attaching different fluorescent labels to transcription factors (TFs) and using light interference and FRET to visualize the spatial and temporal distance between TFs. Fields rich enough to fill a universe of their own seem to be the recent trend, and becoming a PI in BME appears to require work of this kind. Computational tracks are abundant, but the program feels like 80% wet-lab + 20% dry-lab. Reflecting the cost of living in New York City, it offers full tuition funding + ~50k USD/yr stipend, like Columbia University.
○ UW-Madison Medical Physics: A program centered on imaging and nuclear medicine, such as PET, MRI, and ultrasound (US). Admission is by direct admit: the applicant arranges interviews with a PI, so an applicant who has already found a compatible PI can get through the interview process easily. Although the author did not attend this program, the PIs contacted diligently helped find other PIs suited to the applicant, so reaching the interview stage seemed to all but guarantee acceptance.
○ ‘Opportunities’ come cruelly disguised as ‘failures’. What did you gain and learn from the failures you unexpectedly encountered?
○ One step forward from now on. I realize there are plenty of things I want to do in the future.
○ Existential awakening. I feel like I could be happy even if I were stranded on a desert island with only paper and pen. Therefore, I declare that I will not seek inner peace from people.
○ I want to live a fiercely intense life beyond imagination. I want to strive to be seen by the giants who move the world. My mind will never rest for even a week. How harsh and lonely will that process be for me? Nevertheless, I will boil even more. It’s reminiscent of a phrase from Trump.
○ The reason I chose Michigan
○ While attending AACR 2024, I had the opportunity to meet professors like Aviv Regev, Christina Curtis, Dana Pe’er, Sanja Vickovic, and Joakim Lundeberg. From their lectures, I learned that spatial biology has the potential to revolutionize the entire pharmaceutical industry. Michigan stood out as the best place to further my studies in what I believe to be the next phase of spatial biology.
3. Future Plans
○ Regarding Direction
○ During the study abroad application period, I thought about UCLA, but life doesn’t flow according to the initial direction set, much like human relationships. It’s as if first love never comes true.
○ However, ‘direction’ gives momentum, and the actual direction of life tends to be somewhat similar to the initial direction.
○ Why I Benefit Others
○ There was once a person who had many dreams but couldn’t fully realize their potential due to the difficulties of reality. However, that person thirsted for success, so I guided them on the path of studying abroad. In the end, that person entered a master’s program, received full funding, and transferred to a doctoral program within a year. It was reported that they would work as an intern software engineer at Amazon while in graduate school.
○ Giving someone who has lost their dreams a tomorrow to look forward to. I want to do that simply because it seems like a good thing to do, and I want to give hope to those who have lost it.
○ Industry
○ Regarding Industry
○ The lifetime expected earnings in the US job market are several times higher than other career paths.
○ It is considered strategic to experience the big tech industry, regardless of the path one chooses.
○ The recent rise in attention to astrobiology is ultimately because the field has become scalable, automated, and high-throughput. Academia and industry are now inseparable.
○ Direction 1: I started thinking about working at a company called xAI founded by Elon Musk.
○ Reason: Contemplating sustainable AI industries. The existential threat of AI has clearly approached reality.
○ Evaluation: A newly launched startup with only 8 to 10 employees currently. It is expected to grow to dozens to hundreds of employees in about four years. With the motto ‘Understand the Universe,’ I feel like I’ll meet people similar to me.
○ Prediction: AI big tech is expected to become a three-way battle among Google DeepMind, OpenAI, and xAI. However, Google DeepMind and OpenAI have not yet provided adequate answers on the coexistence of AI and humanity. (ref)
○ Direction 2: Global pharmaceutical companies like Pfizer, Moderna, etc.: There is high demand for bioinformaticians.
○ Academia
○ Regarding Academia
○ It seems quite valuable to aim to become a person who is not only a master in academia but also holds significant social influence.
○ Seeing researchers from SciLifeLab, who developed Visium, become professors at Stanford, Columbia University, and others, I realize that developing substantial technologies that can change the era is a prerequisite to becoming a professor at top-tier schools.
○ Direction 1: Academic-style Postdoc: Direction towards prestigious universities like Harvard, Stanford, etc.
○ Direction 2: Laboratory-style Postdoc: Allen Institute, Broad Institute, Sanger Institute, Francis Crick Institute, etc.
○ Direction 3: Corporate-style Postdoc: In IT, companies such as Amazon; in bio, pharmaceutical company programs such as the Novartis Innovation Postdoctoral Fellowship Program.
○ Example of recruitment/postdoc/professorship notices
○ Pfizer
○ Business
○ Eventually, I might have to become someone who creates tens of thousands of high-paying jobs. Moreover, there aren’t many people in South Korea who can do that.
○ The industrial upheaval caused by AI is arriving faster than I thought. Therefore, I want to create large-scale industries that AI cannot easily take over, generating many jobs that AI cannot replace.
○ Politics
○ Contradictions in Politics: This is a space for writing about the contradictions in the world that I would like to change rather than a desire to become a politician.
○ Low Birth Rate: A quarterly total fertility rate as extreme as 0.6, a figure one would expect only in extreme wartime situations, has emerged. This is the most difficult math problem I have ever seen. To solve it, I am contemplating and planning the following solutions.
○ Deploying AI and robots to significantly reduce labor demand.
○ Attracting core industries to increase high-quality jobs.
○ Making all advanced knowledge public to reduce lifelong education costs and promote individuals’ social advancement. In particular, the reluctance of women in the metropolitan area and with high education to marry and have children seems to be related to delayed social advancement and career interruptions.
○ Discouraging love marriages and popularizing arranged marriages. If social advancement is delayed and the age of marriage is pushed back, then one can simply refrain from dating.
○ Other immigration policies, childcare policies, women’s policies, etc., will likely fall within the jurisdiction of administrators rather than scholars.
○ Innovative Industries: The emergence of OpenAI’s ChatGPT implies that almost all industries can be automated in the future. Industries that cannot be replaced by AI in the future should be high-context industries with large scale. These industries can create tens of thousands of high-paying jobs and are ideal industries where high-wage and low-wage workers coexist. I diagnose as follows. Note that once full automation is achieved, the IT industry will find it difficult to remain an independent industry.
○ Bio: Mainly pharmaceutical development. The pharmaceutical market is much larger than the food and cosmetics markets. Even Orion, a chocolate pie company, now makes pharmaceuticals! (ref)
○ Quantum: Quantum computing, quantum communication, nuclear fusion, etc.
○ Space: SpaceX and others.
○ Environment: The only potential innovative industry that has not yet been industrialized. What I want to do is to combine bio, IT, and the environment into one. (ref)
○ Additionally,
○ Suicide rate: ranked first among OECD countries.
○ Political polarization: Since polarization is severe even in the United States, it is not a problem unique to South Korea. Switzerland is worth referring to, as Switzerland achieved political stability because it was surrounded by the Alps and was not invaded by foreign powers.
Input: 2023.11.26 11:01
Revised: 2024.02.18 21:31