Institution: International Institute of Information Technology, Hyderabad
“To talk about an information overload in the modern life sciences is an understatement.” –Nature Biotechnology. The world of bioinformatics is built on only four letters: A, T, G and C. Yet these simple letters construct the most complex species known to date, Homo sapiens. The objective of this mini project is to design a small database-driven web application, along with a search tool, for finding a user-defined stretch of sequence in a user-uploaded data file (preferably a FASTA file). The application will count the single nucleotides, di-nucleotides and tri-nucleotides in the uploaded FASTA file. It will also convert the uploaded sequence into its corresponding codons for each of the different open reading frames. The next part of the project concentrates on searching for a particular stretch of sequence, technically called a restriction site, which is supplied as user input. The final aim of the project is to evaluate the probability of finding CpG islands in the sequence.
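The analysis steps described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the source does not name a language or framework, and the function names (parse_fasta, count_kmers, codons_by_frame, find_site, cpg_stats) are hypothetical. The sketch assumes plain DNA input, works on the forward strand only (three reading frames), counts overlapping k-mers, and uses the observed/expected CpG ratio, a statistic commonly used when screening for CpG islands, as a stand-in for the probability estimate mentioned above.

```python
from collections import Counter


def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; everything after '>' is the header.
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def count_kmers(seq, k):
    """Count overlapping k-mers (k=1 single, k=2 di-, k=3 tri-nucleotides)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def codons_by_frame(seq):
    """Split the forward strand into complete codons for frames 0, 1 and 2."""
    frames = {}
    for offset in range(3):
        frames[offset] = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    return frames


def find_site(seq, site):
    """Return the 0-based start of every (possibly overlapping) match of site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for seq."""
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp
```

For example, with the sequence ATGCGCGAATTC, find_site(seq, "GAATTC") returns [6], the position of the EcoRI recognition site, and codons_by_frame(seq)[0] gives ATG, CGC, GAA, TTC. In a web application these functions would sit behind the upload handler, with the counts stored in the database; that wiring is left out here because the source does not describe it.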