Joe Cornish, Biological Sciences
“Adaptive Learning Neural Networks for Binding Site Search in Genomic Sequences”
Artificial neural networks (ANNs) can be trained to become highly efficient pattern recognition systems capable of discerning non-linear features on complex backgrounds. For this reason, ANNs have frequently been proposed as suitable search tools for the identification of transcription factor (TF) binding sites in genomic sequences. Here we show that ANNs trained with the standard backpropagation algorithm have significantly lower search efficiency than standard weight matrix methods in TF binding site search. We observe that this is due to the ill-balanced nature of the search problem, which requires the identification of a small number of sites against a very large background. We propose a new algorithm, adaptive learning, based on a targeted sampling of the background during backpropagation learning. We validate this approach by cross-validation on an up-to-date collection of CRP sites from Escherichia coli against the original E. coli genome, a randomly generated genome and the genome of Paenibacillus sp. Our results demonstrate that adaptive learning of ANNs improves search efficiency for CRP against the tested backgrounds. The general implications of these findings for machine learning approaches to binding site search are discussed here.
This work was funded, in part, by the UMBC UBM program (National Science Foundation DBI 1031420) and the UMBC Department of Biological Sciences.
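For readers unfamiliar with the weight matrix baseline mentioned in the abstract, the following is a minimal sketch of how such a search is commonly done: a log-odds scoring matrix is built from aligned sites and slid across a sequence. It is not the code used in this project. The function names, the uniform background, the pseudocount, the threshold, and the toy sites and sequence (which are made up and are not CRP sites) are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def build_pssm(sites, background=None, pseudocount=1.0):
    """Build a log-odds position-specific scoring matrix from aligned sites."""
    if background is None:
        # Assumption: uniform base frequencies in the background.
        background = {b: 0.25 for b in BASES}
    length = len(sites[0])
    pssm = []
    for pos in range(length):
        # Pseudocounts keep unseen bases from producing log(0).
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pssm, window))


def scan(pssm, sequence, threshold):
    """Slide the matrix along the sequence and keep windows at or above threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pssm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy, made-up data for illustration only (not real binding sites).
toy_sites = ["TGTGA", "TGTGA", "TGTCA", "TGAGA"]
toy_sequence = "ACGTTGTGACCGTA"

pssm = build_pssm(toy_sites)
hits = scan(pssm, toy_sequence, threshold=5.0)
n_windows = len(toy_sequence) - len(pssm) + 1

print("windows scanned:", n_windows)
print("hits:", hits)
```

Even in this toy case, only one of the scanned windows is a site. In a real genome, a handful of known sites must be found among millions of candidate windows, which is the imbalance the abstract refers to.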
When did you start conducting research at UMBC? How did you find a mentor and project to work on?
My first research experience was in the laboratory of Dr. Marie-Christine Daniel-Onuta in the Chemistry and Biochemistry department. I had been trying to find an undergraduate research opportunity, and a TA I had at the time was a graduate student working in Dr. Daniel-Onuta's laboratory. The TA introduced me to the research projects in the laboratory, and I became very interested in and excited about the work. The TA suggested that I contact Dr. Daniel-Onuta about summer research opportunities.
What did you know about your field/project when you started? How did you learn what you needed to know?
My first experience was in organic chemistry. When I entered the lab, I had just completed the first half of the organic chemistry sequence, so I was still very inexperienced. Most of what I learned came from hands-on experience working with another undergraduate in the lab and especially with the graduate students and post-docs. In my current position in the laboratory of Dr. Ivan Erill, much of what I have learned has again come from hands-on experience, especially with the critical guidance of Dr. Erill.
Who do you work with on your project? Other undergraduates? Graduate students? Faculty?
The project I presented at URCAD was part of an ongoing research project. While the initial work had been started by other graduate and undergraduate students, I began work on the project after they left. Currently I am working with Dr. Erill on the project. I would like to acknowledge the other undergraduates and graduate students working in the lab who have been helpful through conversations we have had during lab meetings and elsewhere.
How did you decide to present at URCAD?
The decision to present at URCAD was simple. Sharing the work you do is a critical part of the research process. It provides many opportunities to learn from others working in the same areas. Poster presentations also provide a great opportunity to learn and network, as they give you a chance to speak with people in person.
Was the application difficult?
The application process is one of the best I have experienced. The URCAD program is highly organized, and the requirements are very clear and concise. Applying is the easiest part of presenting at URCAD. Additionally, things like the point-by-point checklists for abstracts and the pre-event sessions are helpful.
How did you know what to put on your poster?
This was not an easy task. While there is somewhat of a formula to follow when making posters, it can be a challenge to select and arrange the material so that it properly portrays the work. Additionally, bioinformatics draws from many different areas and, as such, attracts audiences with different backgrounds. This is a very important consideration, as you have to provide enough background on each topic within a relatively small poster space.
Were you nervous about explaining your work to so many people? How did it go?
I had presented this material before at a conference, so I had already worked out the "talk." I always enjoy URCAD, as it gives me an opportunity to share my work with fellow undergraduates and undergraduate researchers.
Will you work in the lab during the 2011-2012 school year? How much time will you put in? Do you get paid for this? Academic credit?
I will be continuing research through the 2011-2012 year. Part of my time will be through the UBM program here at UMBC. This program, which is designed to cross-train undergraduates in mathematics and biology research, has been a great experience, and I am excited to continue my work through it. Additionally, the UBM program provides a stipend and summer housing. I am also excited to continue working in Dr. Erill's lab.
What are your goals for after UMBC?
My goals are to pursue graduate education and to continue research. The field of synthetic biology is in its infancy, and I am very interested in exploring how the work I have been a part of can be applied to synthetic biology.
Would you suggest to other undergraduates that they find a research project?
Yes! My experience in undergraduate research has done so much more for me than any course could. Research not only allows you to develop a much deeper understanding of your science, but also allows you to learn many career-critical skills.
What else are you involved in at UMBC?
I tried to stay active in some campus organizations, but I have dedicated my time to research for the past few years. One of the most positive experiences outside of research was through the Shriver Center. I was a volunteer and then a Service Learning Intern for a program called MS Swim. I participated in this program for over a year.