28 OCT 2018 · Welcome to the 2nd episode of Bytes of Bio, Identify the Identifier!
Tadaa~ It's the TATA box. Boy, if you found the first episode hard, wait until you listen to this one. Statistically, I can confirm that 50% of the hosts on the show didn't understand 80% of it.
Let’s go through the main points we have discussed!
Goal: Identify where transcription of a gene starts (marked by the TATA box) so the sequence can be studied more easily.
TATA box: a non-coding sequence that helps the RNA polymerase identify where to start transcription.
Naive Bayes (in detecting a TATA box): takes into account the characteristics of a TATA box, treats them as independent of one another, and calculates the probability that a string of Ts and As is actually a TATA box.
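For the curious, here is a minimal sketch of how a Naive Bayes check like this could look in Python. Everything in it is an assumption made for illustration: the toy sequences, the choice of "which nucleotide sits at which position" as the characteristics, and the function names are all made up, and none of it is taken from the case study.

```python
import math

# Toy training data, invented for illustration only.
TATA_EXAMPLES = ["TATAAA", "TATAAA", "TATATA", "TATAAG"]
BACKGROUND_EXAMPLES = ["GCGCTA", "ATGCCG", "TTAGGC", "CATACG"]
ALPHABET = "ACGT"


def position_probs(examples, pseudocount=1.0):
    """Estimate P(nucleotide | position, class) with add-one smoothing."""
    length = len(examples[0])
    table = []
    for pos in range(length):
        counts = {n: pseudocount for n in ALPHABET}
        for seq in examples:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        table.append({n: c / total for n, c in counts.items()})
    return table


def log_score(seq, table, prior):
    """log P(class) plus the sum of log P(nucleotide | position, class)."""
    return math.log(prior) + sum(math.log(table[i][n]) for i, n in enumerate(seq))


def tata_probability(seq, prior_tata=0.5):
    """Posterior probability that a 6-letter sequence is a TATA box."""
    tata = log_score(seq, position_probs(TATA_EXAMPLES), prior_tata)
    background = log_score(seq, position_probs(BACKGROUND_EXAMPLES), 1.0 - prior_tata)
    # Normalise the two joint scores into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(background - tata))
```

Calling tata_probability("TATAAA") gives a value close to 1, while tata_probability("GCGCGC") gives a value close to 0. The "naive" part is the sum inside log_score: each position contributes on its own, as if the positions had nothing to do with each other.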
Improvements to Naive Bayes:
Improve the prediction accuracy by perturbing the computed prior probabilities.
Take into account dependencies between the factors that determine a TATA box.
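To make the first improvement a bit more concrete, here is a rough Python sketch of what perturbing the prior could mean in practice. It assumes the simplest possible approach, trying a handful of candidate priors and keeping the one that classifies a labelled set best. The case study may do this differently; the scores, labels, and function names below are hypothetical.

```python
import math


def predict(log_likelihood_ratio, prior_tata):
    """Return True when the posterior favours 'TATA box' under the given prior."""
    log_prior_ratio = math.log(prior_tata / (1.0 - prior_tata))
    return log_likelihood_ratio + log_prior_ratio > 0.0


def accuracy(scored_examples, prior_tata):
    """Fraction of examples classified correctly with this prior."""
    correct = sum(predict(llr, prior_tata) == label for llr, label in scored_examples)
    return correct / len(scored_examples)


def best_prior(scored_examples, candidates):
    """Pick the candidate prior with the highest accuracy (ties go to the first)."""
    return max(candidates, key=lambda p: accuracy(scored_examples, p))


# Hypothetical validation set: (log-likelihood ratio, is it really a TATA box?).
VALIDATION = [(2.0, True), (0.5, True), (-0.3, True), (-1.5, False), (-2.5, False)]
CANDIDATE_PRIORS = [0.1, 0.3, 0.5, 0.7, 0.9]
```

Here best_prior(VALIDATION, CANDIDATE_PRIORS) nudges the prior away from the default 0.5 because one real TATA box sits just below the decision line. The second improvement, modelling dependencies between factors, is not covered by this sketch.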
Reference links mentioned throughout the podcast:
Case study for this episode: Extensions of Naive Bayes and Their Applications to Bioinformatics, by Raja Loganantharaj, University of Louisiana, Lafayette (https://drive.google.com/file/d/1QXQHr8iCwDOTdy-u055MvBDARYNjdESY/view)
TATA box (http://bio1151.nicerweb.com/Locked/media/ch17/17_08TranscripInitiation.jpg)
Explanation of Naive Bayes (https://monkeylearn.com/blog/practical-explanation-naive-bayes-classifier/)
Limited dependency (https://cdn-images-1.medium.com/max/800/1*cdvfzvpkJkUudDEryFtCnA.png)
If you like this episode please leave a rating or a review!
It gets easier... I promise... Just, strap yourself in for next week's...