
IBDP for BIOLOGY: Research Data

Guide for the Internal Assessment Individual Project in IB Biology

Databases

This page has links to research data that can be used for your Individual Project in the following fields:

DNA, Amino Acid, SNPs, & Genes
Environmental Data from Satellites
Migrating Marine Animals
Ocean Research
Ornithology - eBird Database
Paleobiology, Paleoecology, and Paleomammalogy

You can find datasets from many disciplines, including environmental and social sciences, as well as government data and data provided by news organizations, in:

Google Dataset Search

Environmental Data from Satellites

  GOES Project Science

NOTE: Message posted on the site: "GONE OUT OF SERVICE / This site has stopped producing GOES images.." This site is still a source for many links to historical GOES data.

Migrating Marine Animals

 Oceans of Data

ODI (a division of Education Development Center (EDC)) provides students access to data collected by migrating marine animals, Earth-orbiting satellites, and drifting buoys, as well as a set of data analysis tools that can be used to take measurements and look for patterns in these complex data sets.

Ocean Tracks

Ocean Tracks provides access to authentic data collected by migrating marine animals, drifting buoys, and satellites, along with tools that allow you to display and analyze these data to investigate current and important scientific questions about animal interactions with the ocean environment. The Ocean Tracks website was created by EDC's Oceans of Data Institute and Stanford University.

 Tagging of Pelagic Predators 

Contains oceanographic datasets of marine life observation. By combining data from a diverse number of highly migratory species, and overlaying them with oceanographic data, it is possible to glimpse the processes that influence how open ocean ecosystems work.

 

Paleobiology, Paleoecology, and Paleomammalogy

 Neomap (Neogene Mammal Mapping Portal) 

A common access portal for two different, free-standing databases in the field of paleomammalogy. These two linked datasets provide point-occurrence data for all published late Oligocene through Holocene mammals in the USA, and for many Quaternary localities in Canada.

Neotoma Paleoecology Database

An online hub for data, research, education, and discussion about paleoenvironments. Users can find information efficiently by searching the database on spatial, temporal, and metadata criteria; interactively browse and visualize live data and metadata; get data and information in a variety of useful formats (e.g., downloads, reports, graphics).

 

 The Paleobiology Database

A public resource for paleontological data, providing global, collection-based occurrence and taxonomic data for organisms of all geological ages, as well as data services to allow easy access to data.

The Cornell Lab of Ornithology

The Cornell Lab of Ornithology is a leader in the study, appreciation, and conservation of birds. Through its programs, the Lab aims to advance the understanding of nature and to engage people of all ages in learning about birds and protecting the planet. It hosts the eBird database, in collaboration with organizations, regional experts, and users ("eBirders") all over the world.

eBird

eBird is the world’s largest biodiversity-related citizen science project, with more than 100 million bird sightings contributed each year by eBirders around the world. eBird data document bird distribution, abundance, habitat use, and trends through checklist data collected within a simple, scientific framework. Birders enter when, where, and how they went birding, and then fill out a checklist of all the birds seen and heard during the outing. Access this database by creating an account with a username and password. eBird includes population data from The Great Backyard Bird Count, maps of citizen-created bird habitat from Habitat Network, bird songs and calls from Macaulay Library, nest camera data from NestWatch, and sightings at bird feeders from Project FeederWatch. These citizen science projects at the Cornell Lab of Ornithology provide a way for people to learn about birds, habitat, science, and conservation while contributing to real scientific studies.

Great Backyard Bird Count

Participants are asked to count birds for as little as 15 minutes (or as long as they wish) on one or more days of the four-day event and report their sightings online at birdcount.org. Each checklist submitted during the GBBC helps researchers at the Cornell Lab of Ornithology and the National Audubon Society learn more about how birds are doing, and how to protect them and the environment we share.

Habitat Network

Habitat Network is a citizen science project designed to cultivate a richer understanding of wildlife habitat, for both professional scientists and people concerned with their local environments. The project collects data by asking individuals across the country to draw maps of their backyards, parks, farms, favorite birding locations, schools, and gardens. Participants then create new habitats for birds and update their maps. The goal is to help people make better decisions about how to manage landscapes sustainably.

 Macaulay Library 

An extensive scientific archive of natural history audio, video, and photographs. Although the Macaulay Library is best known for its collection of bird songs and calls, the collection also includes amphibians, fishes, and mammals, and the library preserves recordings of each species’ behavior and natural history.

 NestWatch

Volunteers help scientists at the Cornell Lab of Ornithology by collecting data on the successes and failures of nesting birds by tracking nests, clutches, broods, and fledglings. NestWatch is a nationwide monitoring program designed to track status and trends in the reproductive biology of birds, including when nesting occurs, number of eggs laid, how many eggs hatch, and how many hatchlings survive.

 BirdSleuth is the K-12 education program of the Cornell Lab of Ornithology. They take an inquiry-based approach to science curriculum that engages kids in scientific study and real data collection through the Cornell Lab of Ornithology’s citizen-science projects. Look through BirdSleuth's science curriculum resources for teachers for Internal Assessment project ideas.

DNA, Amino Acid, SNPs, & Genes

You can find DNA sequences, amino acid sequences, SNPs (single nucleotide polymorphisms), genes, and other related databases in the links below. Most are from the National Center for Biotechnology Information, part of the U.S. National Library of Medicine.

 BLAST

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
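To make "regions of local similarity" more concrete, the sketch below scores the best local alignment between two short sequences using the Smith-Waterman dynamic programming method. This is an illustration only: BLAST itself relies on faster heuristics and also reports statistical significance, which this sketch does not attempt. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example, not BLAST defaults.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Scoring values are illustrative assumptions, not BLAST parameters.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + step,  # extend along the diagonal
                prev[j] + gap,       # gap in sequence b
                curr[j - 1] + gap,   # gap in sequence a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical sequences sharing the short region "ACGT"
    print(local_alignment_score("TTACGTTT", "GGACGTGG"))
```

A higher score means a longer or cleaner shared region. For real sequences from the databases on this page, use the BLAST web tool rather than a hand-written scorer.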

 dbSNP

Database of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and non-polymorphic variants.

 Genes

This database integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.

 Genetic Testing

This curriculum unit from the Northwest Association for Biomedical Research explores how bioinformatics is applied to genetic testing. Specifically, the bioinformatics tools of BLAST and Cn3D are used to investigate the genetic and molecular consequences of a mutation to the Breast Cancer Susceptibility 1 (BRCA1) gene.

 Genome

This resource organizes information on genomes including sequences, maps, chromosomes, assemblies, and annotations.

 Nucleotide 

The Nucleotide database is a collection of sequences from several sources, including GenBank (an annotated collection of all publicly available DNA sequences), Reference Sequence (RefSeq, a collection of sequences including genomic DNA, transcripts, and proteins), TPA (Third Party Annotation), and PDB (Protein Data Bank).

 Protein

The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt (a protein sequence database), PIR (Protein Information Resource), PRF (Protein Research Foundation database of amino acid sequences of peptides and proteins), and PDB. Protein sequences are the fundamental determinants of biological structure and function.

 RCSB Protein Data Bank

This resource is powered by the Protein Data Bank archive, which holds information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps students and researchers understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. The Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) curates and annotates PDB data. The RCSB PDB builds upon the data by creating tools and resources for research and education in molecular biology, structural biology, computational biology, and beyond.

Ocean Research

NOAA Data in the Classroom

NOAA Data in the Classroom makes data from several federal environmental agencies available online for use in studying El Niño, sea level, water quality, ocean acidification, and coral bleaching.

 Northwest Association of Networked Ocean Observing Systems (NANOOS)

Google Dataset Search

Google Dataset Search

Google's Dataset Search platform enables users to find datasets stored across the Web through a simple keyword search. The tool surfaces information about datasets hosted in thousands of repositories across the Web, making these datasets universally accessible and useful.

Google believes that this project will have the additional benefits of a) creating a data sharing ecosystem that will encourage data publishers to follow best practices for data storage and publication and b) giving scientists a way to show the impact of their work through citation of datasets that they have produced.

As more dataset repositories use schema.org and similar standards to describe their datasets, the variety and coverage of datasets that users find in Dataset Search will continue to grow.
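As a rough illustration of what "using schema.org to describe a dataset" can look like, the sketch below builds a minimal schema.org Dataset description and prints it as JSON-LD. The helper function, the dataset name, the description, the keywords, and the example.org URL are all hypothetical placeholders; real repositories typically include more properties than shown here.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a minimal schema.org Dataset description as a dictionary.

    Hypothetical helper for illustration; only a few common
    schema.org Dataset properties are included.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


# All values below are made-up placeholders, not a real dataset.
record = dataset_jsonld(
    name="Example school pond invertebrate counts",
    description="Weekly counts of invertebrates sampled from a school pond.",
    url="https://example.org/pond-counts",
    keywords=["biology", "freshwater", "invertebrates"],
)

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
```

A description like this is embedded in the web page that hosts the dataset, which is how a search tool can recognize the page as a dataset and index it.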