Sunday, May 5, 2013

Wrap up of the #SMBEeuks meeting and QIIME workshop at UC Davis

Thanks to everyone who attended the SMBE Satellite Meeting on Eukaryotic -Omics at UC Davis last week (April 29 - May 2, 2013). The event was a resounding success - it was wonderful to meet participants from such diverse backgrounds, working on different aspects of eukaryotic genomics and biodiversity studies. Many thanks to meeting sponsors SMBE, MOBIO and Illumina for their generous financial support. Fingers crossed for other similar meetings in the future! 

For reference, all meeting documents are available here:
Twitter discussions that took place at the meeting each day have been compiled using Storify (a great online tool that collects tweets before Twitter locks them away in their archive):
Some speakers have posted their slides online - hopefully I can expand this list as I convince more participants to share their talks and posters (updated 5/18):
On Thursday morning we held breakout group sessions to discuss the overall themes at the #SMBEeuks meeting, and put forward some recommendations for increasing the pace of scientific progress in Eukaryotic -Omics fields. A general discussion took place before we broke off into two smaller groups for more specific discussions. Notes are posted here:
Finally, my deepest thanks to Laura Wegener Parfrey, Tony Walters, and Adam Robbins-Pianka for running a fantastic QIIME workshop after the #SMBEeuks meeting. We had a packed room of eager biologists who were ready to pick up some command line expertise. QIIME workshop documents are posted below - additional thanks to microBEnet and the Alfred P. Sloan Foundation, who supported this workshop!

Thursday, March 28, 2013

Primer tests for Fungal ITS regions...plus, statistics!

Reading a good paper is so inherently satisfying--and if you want to share my satisfaction, I recommend this recent piece of literature:
Bazzicalupo AL, Bálint M, Schmitt I. (2013) Comparison of ITS1 and ITS2 rDNA in 454 sequencing of hyperdiverse fungal communities. Fungal Ecology, 6(1):102–9. 
I only wish this paper wasn't paywalled, because it contains quite a bit of useful information that is extremely relevant for the environmental sequencing community.

Firstly, the authors carried out a comparison of ITS primer sets and assessed their ability (and overlap) in recovering different fungal Orders, Families, Genera, and Species. I'm a big fan--these types of primer comparisons are important for figuring out what we might be missing in any given PCR-based approach.
Our results suggest that ITS2 may be more variable and recovers more of the molecular diversity. We confirm an earlier in silico study showing that ITS1 and ITS2 yielded somewhat different taxonomic community compositions when blasted against public databases. However, we demonstrate that both ITS1 and ITS2 reveal similar patterns in community structure when analyzed in a community ecology context. [Bazzicalupo et al. 2013]
Secondly, I feel like I learned some statistics by reading this paper! Or at least, I understood why authors chose the methods they did. I really liked that this paper includes detailed explanation of the statistical tests used to assess the ITS regions and make OTU comparisons. For example:
We compared OTU abundance distributions between the ITS1 and ITS2 datasets at all similarity levels with the Kolmogorov-Smirnov (KS) test to see whether the ITS1 or ITS2 would project higher OTU richness in the samples. KS tests are often used to test the distribution of datasets against other distributions, so one may use it to test if a dataset is e.g. normally distributed (Conover 1999). However, the KS test may also be used to compare the shapes of two empirical distributions. Species abundance distributions contain information about both the richness and evenness, thus the comparison of distributions is more meaningful than comparing the means of distributions with e.g. t-tests (Phillips et al. 2012). [Bazzicalupo et al. 2013]
I don't have a strong statistics background (but I'm very aware that I need to become more competent in this area), and this paper helped me understand what types of statistical tests I could apply to environmental sequence data in future analyses. In this regard, the Bazzicalupo et al. methods section was a great change of narrative, compared to the stats-name-dropping-without-explanation I see so often in other papers.
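To make the KS idea concrete for myself, here is a minimal sketch in plain Python of the two-sample KS statistic, which is simply the largest vertical gap between two empirical cumulative distributions. This is my own illustration, not the authors' code: the function name `ks_statistic` and the OTU abundance values are invented for the example, and the sketch only computes the statistic, not a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical reads-per-OTU values, made up purely for illustration
its1_otus = [120, 45, 30, 8, 5, 2, 1, 1]
its2_otus = [90, 60, 40, 22, 10, 6, 3, 2, 1, 1]
print(ks_statistic(its1_otus, its2_otus))
```

A statistic near 0 means the two abundance distributions have very similar shapes, while a value near 1 means they barely overlap. For a real analysis I would reach for an established implementation such as `scipy.stats.ks_2samp`, which also reports a p-value.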




Wednesday, March 27, 2013

International Research Coordination Network for Biodiversity of Ciliates

I was browsing through this NSF report on Dimensions of Biodiversity Projects 2010-2012, and I stumbled across this project which I had no idea even existed!


Upon further investigation, I discovered that this ciliates RCN has a portal website (including a document listing "Grand Challenges" in the study of ciliates). The inaugural meeting took place in September 2012 at NESCent, so it looks like the RCN is still in the early stages.

Frustratingly, the website doesn't seem to have been updated in quite a while (May 2012), so there isn't much new information about workshop outcomes or upcoming RCN activities. I'm excited to keep tabs on this new community - the discussions and outputs will be very relevant to high-throughput environmental sequencing approaches.


Wednesday, February 6, 2013

Content is King (Part 1): Social Media strategies according to Evan Bailyn


**This is a cross post from microBEnet, a portal website for the Alfred P. Sloan Foundation's program focused on the Microbiology of the Built Environment. This is one Eisen lab project I'm heavily involved in, since microBEnet thinks a lot about social media.**

Part of what we're trying to do is to put the net in microBEnet. As in, building an online network for an emerging research discipline (Microbiology of the Built Environment) that connects building scientists and engineers with biologists, ecologists and computer scientists.

The internet is a big place. Publicizing a new cause or web portal can be overwhelming, even for those of us who think we know what we're doing.

Since the social media realm is largely considered "untested waters" (and we scientists are hardly Silicon Valley insiders) there's a lot of experimentation. Figuring out what works and what doesn't. Trying new strategies as the web evolves.

Until a few months ago, I was completely unfamiliar with the concept of Search Engine Optimization (SEO). But then I attended a talk by Evan Bailyn, author and web entrepreneur who has extensive practical experience with the worlds of web searches and social media.

Evan's #1 commandment for building an online presence (a brand, your professional reputation, or an online community such as microBEnet)? Create excellent and unique content, frequently - ideally every day. This will not only draw people into your site, but it will dramatically improve your search engine rankings.

To achieve internet domination, Evan outlines his 3-step "Nuclear Football" approach (a meaningless phrase, but you'll remember it):

1) Create content


Content reigns supreme. Your content will define your unique voice--it's what will get people hooked on your site, and separate you from everyone else on the internet. In ecological terms, you must define your niche or face extinction.

Every site should have a blog - this is one of the easiest ways to publish new content. The next step is to define your audience (who will you target?), and figure out what content would be interesting for them. Once you have strategized and set up your site, try to update the blog with new posts every day. However! Don't just post for the sake of posting - to capture audiences there needs to be real passion emanating from your daily content. Don't think about SEO, web traffic or future accolades--it will only create stress and cripple your efforts to create really awesome content. Also, never be afraid to cater to the average person's desire for low-level entertainment (why do you think we use such low-brow humor on Deep Sea News, a marine science blog that I contribute to?). Evan even suggested testing content on forums such as Reddit--what will garner the most interest from your audience?

Evan's general blogging guidelines: posts should be a minimum of 500 words, with a minimum of 3 images. Website images should be high-resolution (don't use those cheesy stock photos), and have captions that will help improve your search engine ranking. Blog posts should have appropriate and descriptive titles; this is vitally important for people finding your web content via search engines.

WordPress software (what we use for microBEnet) is particularly friendly for SEO. Plugins such as Yoast allow you to manually edit meta page titles (the text that appears in the top bar of your browser application, next to the minimize/maximize buttons - NOT what's on the webpage itself). The meta page title is normally filled in automatically in WordPress from the blog post title, but editing this to add in a few more keywords will help bring more viewers to your site. Consider blog post and meta page titles as the "prime real estate" for search engines; the order of words doesn't matter, but you should write a meta page title that is readable by both humans and computers.

Specific titles will increase your exposure to your target audience--people who are seriously researching a subject (or about to make a purchase) will use what are called long-tail search terms. For example, someone Googling "microbes" might just be browsing around or looking for a link to Wikipedia, but a person using a more specific phrase like "microbes that live in air conditioners" is desperate for specialized content. The latter person would represent the serious audience for microBEnet - so it's in our best interests to appear high in the search rankings for that given phrase!

Surprisingly, URLs aren't as important as they used to be - these were devalued by Google about a year ago. Even more reason to pay close attention to page titles.

2) Reach out to get links and exposure


Once your website is established, content is being created, and all the SEO tools are in place, it's time to get the word out.

The internet is a big place--even with the best content, your site will lurk in the darkness without external support. Google uses links AND page titles to determine search results. Evan underlined that Google's fundamental search strategy hasn't changed in 8 years, despite frequent tweaks to the algorithm. Algorithm updates are mainly meant to improve Google's ability to eliminate potentially spammy links; Google does not like it when websites pay $$ for links (this is a spammy strategy carried out for the sole purpose of improving search rankings). Spammy links impact a site's "Trust Rank", and will ultimately hurt a site's search engine rankings if (when) Google finds out.

The bottom line: getting people to link to your website is a much more valuable strategy. It shows that your website is a major hub and houses important content. In this respect, getting linked to from high-traffic sites with high search rankings is a definite victory.

So how do you "bait" people to link to you? One tip is to tweak content according to what's hot right now. Customizing blog posts according to time of year (seasons, holidays) or newsworthiness is one way to create unique, compelling posts that can easily hook an audience. Evan commented that people tend to love Top 10 lists and superlatives. Another method is to leverage your online network and ask for links, although this seems to be a time-consuming approach (sending out 20 emails, only getting 1 positive response), and perhaps not always appropriate for science (you don't want to come across as too self-promoting, and you might turn people off).

3) Convert people from viewers to buyers/followers


As your website grows, the final step is to convert casual visitors into captive followers--capitalizing on your web traffic and the people that come to you via search engine terms and external links.

To illustrate the power of Search Engine Optimization, Evan relayed a case study from his own life. His wife had always dreamed of being a theme park designer, but needed to find a way to break into the industry. She realized that theme parks are a small, niche industry where not many people are web-savvy; like science, the theme park industry also has what Evan calls "Microcelebrities" - people who are well known within their own specialized community. To get her name out there in the theme park world, Evan's wife set up a website called Entertainment Designer and started producing content that was unique but relevant for the theme park community. In particular, she sent "Interview Request" e-mails to microcelebrities, and started posting the resulting transcripts on the site. This unique content led to steady growth in site traffic over a relatively short timespan (1.5 years from web setup to interview posts). Entertainment Designer ultimately became a hub for the theme park industry, and nowadays Evan's wife acts as an intermediary who offers formal introductions to people who hope to work together in the industry. Through SEO and targeted content, the website enabled her to get her foot in the door, without having any training or previous experience.

The best advice for gaining followers? I think that every case is different, and requires some degree of trial-and-error. But general rules include knowing your target audience AND your chosen medium, and defining what you hope to accomplish in advance.

Conclusions

The internet continues to evolve, and there are some notable trends on the horizon. Google continues to personalize and customize search results--you'll notice your search results will be different depending on whether you're signed in or out of your Google account. When you're signed in, your search results will be affected by 1) your location and 2) your personal search history. This increasing reliance on personalized search will have significant implications for SEO strategies.

Perhaps the biggest secret of all is how to create content which is interesting to people--Evan firmly stated that frequent but boring, run-of-the-mill content isn't worth writing. If you take home one message from this post, take this: Content is King.

To find out more about SEO and social media strategies, I definitely recommend reading Evan Bailyn's books. I just finished Outsmarting Social Media, which we received for free at that talk, but another one I want to read is Outsmarting Google: SEO Secrets to Winning New Business.

Thursday, January 24, 2013

SMBE Meeting on Eukaryotic -Omics: April 29-May 2 at #UCDavis

The website is built, speakers have been lined up, and we're ready to announce it to the world:



Along with my former PI Kelley Thomas at the University of New Hampshire, I received funding from the Society for Molecular Biology and Evolution to host an SMBE Satellite Meeting focused on Eukaryotic -Omics at UC Davis this spring. The meeting dates have been set as April 29-May 2, 2013, and the meeting description is as follows:
The SMBE Satellite Meeting on Eukaryotic -Omics will bring together an interdisciplinary pool of researchers to discuss current efforts, challenges, and future directions for high-throughput sequencing approaches focused on microbial eukaryotes (environmental studies of non-model organisms). The meeting program will encompass investigations of eukaryote biodiversity, ecology, and evolution, using approaches such as rRNA marker genes, shotgun metagenomics, metatranscriptomics, and computational biology tools and software pipelines.
See the meeting website (http://www.smbe.org/eukaryotes/) for program announcements, registration details, and travel award information. We're currently in talks to tack on a QIIME workshop at the end of the meeting (tentative dates May 2-4), so keep an eye out for further details. The official conference hashtag will be #SMBEeuks on Twitter.

STEM diversity has been on my mind a lot lately, particularly given the Eisen lab's obsession with equality in gender representation. So I'm very excited to announce that our call for travel award applications includes a heavy focus on diversity--encouraging early-career applicants as well as those from underrepresented groups. Deadline for abstract submission and travel grant applications is February 22, 2013 - mark it on your calendars!


Wednesday, January 16, 2013

Our #asm2013 Session: "Phylogenomics and Microbial Species Concepts"

The preliminary program is out for the 2013 meeting of the American Society for Microbiology, to be held May 18-21 in Denver, Colorado.

I'm very excited to announce two awesome sessions being led by the Eisen lab. First is the session I'm co-convening, entitled "Phylogenomics and Microbial Species Concepts" (session dates and description below). The second session is "Citizen Microbiology: Enhancing Microbiology Education and Research with the Help of the Public", led by Jonathan Eisen and David Coil.

The abstract submission deadline has just been extended to Thursday, January 17th - consider submitting to these two awesome sessions if you're vying for a talk!



Sunday, January 13, 2013

Navigating (and drowning in) the flood of PLoS ONE journal articles

I love PLoS ONE--both the mission of the journal and much of the science that is published there--and for the large part I love the new website redesign. But one thing I'm definitely not feeling is the revamped e-mail alert system.

I will admit it up front: I still abide by some old skool methods for discovering relevant literature. Every week, I pore over the Table of Contents and early article alerts from my favorite journals, neatly delivered to my inbox. Twitter also helps me find a lot of literature, but I find it to be more of a stochastic and unpredictable method (particularly for weeks where my time for social media is limited due to a heavy workload or lots of travel). Plus, being on Pacific Time puts me off kilter with the rest of the world--relevant information is very easily buried in my Tweet stream, even on days when I am looking. So I stick to my e-mail alerts to make sure I don't miss any exciting new science.

Up until a month or so ago, the PLoS ONE e-mail alerts were a behemoth, but they were manageable. The HTML e-mail was nicely formatted with embedded links to a list of articles in fairly specific subject areas, such as "Marine and Aquatic Sciences" and "Evolution and Ecology". It would take a couple of minutes to scan through these sub-categories, but for the most part it was a pretty good way to filter out the research areas which were most certainly not relevant to you. Also, many articles were placed into multiple categories, so an environmental metagenome study using novel analysis methods would be listed under the subject headings for "Computational Biology" and "Evolution and Ecology".

So, much to my dismay, I've been going through my holiday-induced backlog of journal alerts and was horrified by the new format for the PLoS ONE Table of Contents:


The subject headings that were formerly useful for me have now been completely condensed into the very broad subject headings "Biology and Life Sciences" and "Environmental Sciences and Ecology". Worse, each of these subheadings seems to contain a ridiculous number of articles (I didn't count how many, but I was scrolling for a looooong time before I reached the end of the subsection). And it also seems like I need to be looking through both of the above-mentioned subject headings: there were a few relevant articles peppered amongst lots of non-useful literature in each subheading.

I don't have the time (nor do I want to make the time) to scroll through lots of irrelevant scientific literature essentially looking for a needle in a haystack. So I took the advice of the yellow banner and went to create a custom alert on the PLoS ONE website. Frustratingly, there is no way just to look for new articles within a defined subset of subject areas. You have to include a search term, which immediately narrows your search window. I tried just doing a simple search for "metagenomics", but I was getting a lot of biomedical/clinical articles amongst the interesting ones (and I didn't want to scroll through all 963 articles). Plus, I'm paranoid that my search term isn't catching all the articles that I would want to see. I tried filtering down the articles to a more manageable set, but my attempts did not go over very well:


My final gripe is the subject categories themselves. The checklist of subject terms initially presented under the Advanced Search function is different from the larger list of subject terms listed under "filter by subject area". Are the "filter by subject area" search terms defined based on the articles themselves? I have no idea. I also have no idea what half of the subject terms mean. There's a subject term called "Sequence Analysis" and also one called "Research and Analysis methods" - could/should an article overlap these two terms, or are they referring to two distinctly different things? In my mind these categories seem a bit too vague and redundant to be much use for users. Subject terms also have some glaring errors--there is no "Ecology" category at all!

In the end, I basically gave up. I'm going to begrudgingly go back to that monster of an e-mail Table of Contents. 

I'm a firm believer in intuitive web interfaces with powerful user functionality--I don't think any researchers should have to work this hard to complete what is essentially a very simple (and very common) task. It is also in the best interest of journals (and authors) to have their work easily accessible--the articles I'm downloading today may result in future citations, blog posts, social media sharing, etc.

So my pleas for PLoS ONE:

  • Bring back the old e-mail alert format! Or even better, a new revamped format with even more useful subject categories.
  • Consolidate and streamline the subject terms - make them consistent between search interfaces, and specific enough that the meaning of each term is obvious. Ideally, each article would have something like Mendeley tags that would function as searchable keywords. If I liked a particular article, there would be a way to view articles with similar keywords--kind of like "Customers who bought item X also looked at these items" on Amazon.com.

I know these types of fixes won't necessarily be easy - I don't know how PLoS ONE organizes its article databases, and the things I'm suggesting might require a significant amount of coding and/or manual curation to implement. But I do think this type of organization is imperative for the long-term business model of the journals. Keep the scientists happy!