OPTIMIZING CLASSIFIERS FOR PROTEIN SECONDARY STRUCTURE PREDICTION
Abstract
Protein secondary structure prediction (PSSP) is important for understanding protein structure and function, and it can be seen as a bridge between the amino acid sequence and the 3D structure of a protein. Many methods have been proposed to improve prediction accuracy, and they have achieved good results. Several factors affect the performance of a method; one of them is the choice of hyperparameter values. Hyperparameters are parameters that cannot be learned directly from the regular training process. Although methods come with default hyperparameter values, it is possible to improve their performance by using other values that are better suited to the task. Parameter optimization plays an important role at this stage: it searches for the hyperparameter values with which a method performs best. In this thesis, computational methods such as random forests, support vector machines, and deep convolutional neural fields have been used and optimized on the CB513 dataset. We aim to optimize these methods over different hyperparameter values in order to improve the results and to show the importance of parameter optimization in protein structure prediction. We also apply some ensemble methods and compare their results with those of the individual classifiers to see whether the results improve.
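As an illustration of what hyperparameter optimization means in practice, the following minimal sketch runs an exhaustive grid search with 5-fold cross-validation. It is not the procedure used in the thesis. Everything in it is an assumption made for the example: the data are synthetic two-dimensional points labelled H, E, and C rather than CB513 features, the classifier is a small k-nearest-neighbour stand-in rather than a random forest, support vector machine, or deep convolutional neural field, and the names `make_toy_data`, `knn_predict`, `cv_accuracy`, and `grid_search`, as well as the "default" setting, are hypothetical.

```python
import itertools
import random
from collections import Counter


def make_toy_data(n=120, seed=0):
    """Synthetic 3-class points standing in for (features, label) pairs.

    Assumption: real PSSP features (e.g. from CB513) would replace this.
    """
    rng = random.Random(seed)
    centers = {"H": (0.0, 0.0), "E": (3.0, 0.0), "C": (1.5, 2.5)}
    data = []
    for i in range(n):
        label = "HEC"[i % 3]
        cx, cy = centers[label]
        point = (cx + rng.gauss(0, 1.0), cy + rng.gauss(0, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k, p):
    """Majority vote among the k nearest training points.

    k (neighbours) and p (Minkowski order) are the hyperparameters:
    they are fixed before prediction and are not learned from the data.
    """
    def dist(a, b):
        return sum(abs(u - v) ** p for u, v in zip(a, b))

    nearest = sorted(train, key=lambda item: dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, k, p, folds=5):
    """Accuracy pooled over all held-out folds for one hyperparameter setting."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]  # every folds-th sample, offset f
        train = [d for i, d in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k, p) == y for x, y in test)
    return correct / len(data)


def grid_search(data, grid):
    """Evaluate every combination in the grid and return the best one."""
    results = {}
    for k, p in itertools.product(grid["k"], grid["p"]):
        results[(k, p)] = cv_accuracy(data, k, p)
    best = max(results, key=results.get)
    return best, results


data = make_toy_data()
grid = {"k": [1, 3, 5, 9, 15], "p": [1, 2]}
best, results = grid_search(data, grid)

default = (1, 2)  # hypothetical "default" setting, included in the grid
print("default", default, "accuracy:", round(results[default], 3))
print("best   ", best, "accuracy:", round(results[best], 3))
```

Because the default setting is itself one of the grid points, the selected setting can never score below it under the same cross-validation protocol. Whether the gain is meaningful depends on the data, and a separate held-out test set would be needed to report an unbiased accuracy for the chosen values.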