Posts

Drama, intrigue, arrogance, dashed hopes, rock-bottom, perseverance, and eventual triumph: this post has it all! It starts with me watching Rachael Tatman’s recent live-coding video and ends with a thrilling race to the bottom between two pathetically slow functions. What lies ahead: many a WTF moment, lots of trial and error, and some useful tidyverse data wrangling tips. Rachael Tatman is a data scientist at Kaggle and does these awesome live coding sessions every Friday.

Disclaimer: I’m a trained microbiologist/biochemist, which means most of my bioinformatics knowledge was self-taught. What you’re about to see may not be pretty; the code might be janky or the workflow inefficient. But I have gone through countless hours of googling, reading, and trial and error to learn this, and it works pretty well for me, so it might for you too. Let me know if you spot errors or have suggestions for improvement!

The more I use the tidyverse in my R coding, the more I ask myself: does Hadley Wickham hate dogs, or does he just need help with dog-related package names? See, of the packages Hadley has developed for the tidyverse, there are two that have cat-inspired names (forcats and purrr) but zero that pay homage to man’s best friend. It’s not like doggo names are hard to think of for R packages; it took me 30 seconds to come up with baRk and woofR**.

I’ve been going through the job application cycle recently, which meant updating my CV. You can write a CV with Microsoft Word, but I find it exceptionally frustrating to do any sort of fancy formatting in Word, and more importantly, I want my CV to be a page on my website (not just a downloadable file) that has the responsiveness expected of any modern webpage. I found this excellent HTML/CSS template from Thomas Hardy, and decided it was the aesthetic I was going for.

This week I’ve been ploughing through final figure revisions for a big paper that’s been a couple years in the making 👏👏👏. Everything was going (relatively) smoothly until I got to a tree I was trying to plot with associated barcharts. The idea was to summarize some data on the major clades in this tree by placing barcharts of summary statistics beside the tree, aligned with each major clade.

Recently, as part of my work characterizing plant cell walls, I needed to express a few proteins that would serve as molecular probes. I read a couple of papers and the boss man gave me the green light. The first step is to find the protein sequence and have the gene synthesized so that we can transform it into some E. coli and start expression tests. So to order the synthesized gene I went to the paper that described the protein, scoured the methods, and found that, as expected, they didn’t give the sequence; they simply referred to a previous paper, which referred to a previous paper, and after going down the rabbit hole I finally found the original reference.

1 Introduction
1.1 Essential Tools and Basic Knowledge
1.1.1 Fasta Format
1.1.2 Notepad++
1.1.3 BioEdit
1.1.4 Unix Terminal
1.1.5 R
1.2 Getting Data
1.2.1 Keyword Search
1.2.2 BLAST
1.3 Cleaning Data
1.3.1 Concatenate Sequences
1.3.2 Remove Duplicate Sequences
1.3.3 Remove False Hits
1.3.4 Trim Extra Domains
1.3.5 Clean up Names
1.4 Conclusions

Disclaimer: I’m a trained microbiologist/biochemist, which means most of my bioinformatics knowledge was self-taught.

Disclaimer: I’m a trained microbiologist/biochemist, which means most of my bioinformatics knowledge was self-taught. What you’re about to see may not be pretty; the code might be janky or the workflow inefficient. But I have gone through countless hours of googling, reading, and trial and error to learn this, and it works pretty well for me, so it might for you too. Let me know if you spot errors or have suggestions for improvement!

Well, as promised, to follow up my Friday Fails post, here’s a Saturday Success. I had a few science successes this week, like collecting some decent AFM images, learning purrr, and posting on this blog and tweeting (two of my outreach-related goals). But what I’m most proud of is my recent decision to try reducing my plastic use and the waste I generate. It gives me so much anxiety and makes me feel very helpless and sad to think about the future of our planet.

Welcome to a new series: Friday Fails. I’m aiming for weekly consistency with this one, but some weeks I may just tweet my fails (and inevitably miss some posts; look out for an upcoming Friday Fail on not posting the previous week!). The reason for this series is not to denigrate myself; it’s in the interest of honesty. The point is to showcase the real me, or at least a less shiny version… shiny as in that girl in your lab who has an NSF grant, 3 papers, and manages to show up every day looking 💯; or that professor who got tenure while raising 2 kids and publishing her ass off.
