CASE HISTORIES

University of Cambridge FlyMine database project uses the web to liberate the fly!

The computational power available to today’s researcher is producing ever larger quantities of useful data, but this in turn is creating new problems when working co-operatively with other scientists. How is this new data to be integrated with existing information, often held on different systems in distant locations, and how is it to be made available for further analysis?

The FlyMine project, based at the Department of Genetics, University of Cambridge, was set up to address these issues by building an integrated database of genomic, expression and protein data for Drosophila (common name, fruit fly) and Anopheles (common name, mosquito), and making this available to the worldwide research community via web and bioinformatics interfaces. The Project Leader, Dr. Gos Micklem, obtained funding from the Wellcome Trust for 5 years (subject to review in year 3), and put together a team of 7 software, IT and bioinformatics specialists to work with him. When it came to choosing the systems to run FlyMine on, other researchers recommended that the team talk to Cambridge Online because of their experience and expertise.

Clear Project Aims

The Internet was to be used to enable a broad community to query the database, and the database schema and associated programs were to be freely available through an open-source repository. The ability of the systems to support this open-source structure was one of the IT requirements that the FlyMine team put to Cambridge Online from the beginning.

Why they chose the Cambridge Online solution

Once the project funding had been agreed, it was a prerequisite that the IT component was put out to open tender. François Guillier, System Administrator for the project, comments on the quality of some of the tenders: “We had 14 serious bids for the project’s IT solution, but many of these showed little understanding of what we were trying to do. There were a final 4 possible suppliers, and from those we selected a winner with price being one measure, but we also needed a supplier that was responsive and could provide the service we needed. We selected Cambridge Online as they offered the best value.”

Keith Allen, Technology Consultant at Cambridge Online, explained why their tender had the edge: “Our solution was one of two HP (Hewlett-Packard) system configurations and was compared with others from Sun and Fujitsu. The clustering and sheer performance of the HP systems made them the manufacturer of choice, but it was Cambridge Online’s implementation that offered the right performance at the best price and together with our high service level made us the supplier of choice.”

The initial pilot phase for the project was run on Intel servers running Linux, but the production system has been designed to support the demands of users worldwide and runs on HP AlphaServers running UNIX. The cluster comprises two ES45 AlphaServers and one DS10 AlphaServer, clustered using TruCluster for Tru64 UNIX to give a shared file system across the machines. Data storage is provided by an HP StorageWorks SAN (Storage Area Network) with dual HSG fibre channel so that each clustered machine can see the same data file. Within the cabinet there is space for a storage array of 168 disks.

The 5-year project will gradually catalogue and provide more data, and over this time the forward march of technology means that while the density of storage available on one disk increases, the price per MB of storage will undoubtedly decrease. Hence Cambridge Online is working with the FlyMine team to keep them updated on the progress of storage technology, and the team purchases a new tranche of disks only when it is required. In this way the FlyMine team ensures it gets the most from its storage spend. The HP systems are at the hub of the project, integrating locally produced data as well as data from other databases on systems located around the world.

The challenge of working within one of the Oldest Universities

To maintain the high quality of the systems Cambridge Online supplies, the build process takes place within the dedicated workshop and test facility at their Cambridge Science Park headquarters. However, instead of being shipped as they were on completion, the two cabinets were carefully broken down into pieces that were small enough to be hand carried up the staircase. The minor rebuild then took place within the confines of the small machine room at the Department of Genetics. The system was then commissioned.

FlyMine delivers

The software team on FlyMine will continue to develop new open-source tools and offer them for free, and this, together with the underlying data on Drosophila and Anopheles, will provide a powerful research tool to scientific users worldwide. Cambridge Online will continue their support of the FlyMine team as the project matures.