ICT5253 Cloud Architectures and Solutions Case Study 2 Sample
Assignment Brief
Case Study
Ancestry, the global leader in family history and consumer genomics, uses sophisticated engineering and technology to help everyone, everywhere discover the story of what led to them. The company has spent more than 30 years innovating and building products and technologies that, at their core, result in real and emotional human responses. Ancestry currently serves more than 2.6 million paying subscribers, holds 20 billion historical records, 90 million family trees and more than four million people are in its AncestryDNA network, making it the largest consumer genomics DNA network in the world. The company's popular website, ancestry.com, has been working with big data long before the term was popularised. The site was built on hundreds of services, technologies and a traditional deployment methodology.
"It's worked well for us in the past," says Paul MacKay, software engineer and architect at Ancestry, "but had become quite cumbersome in its processing and is time-consuming. As a primarily online service, we are constantly looking for ways to accelerate to be more agile in delivering our solutions and our products."
The company is transitioning to cloud native infrastructure, using Docker containerisation, Kubernetes orchestration and Prometheus for cluster monitoring. "Every single product, every decision we make at Ancestry, focuses on delighting our customers with intimate, sometimes life-changing discoveries about themselves and their families," says MacKay.
"As the company continues to grow, the increased productivity gains from using Kubernetes has helped Ancestry make customer discoveries faster. With the move to Dockerisation for example, instead of taking between 20 to 50 minutes to deploy a new piece of code, we can now deploy in under a minute for much of our code. We've truly experienced significant time savings in addition to the various features and benefits from cloud native and Kubernetes-type technologies."
Since its introduction a decade ago, the Shaky Leaf icon has become one of Ancestry's signature features, which signals to users that there's a helpful hint you can use to find out more about your family tree. So, when the company decided to begin moving its infrastructure to cloud-native technology, the first service that was launched on Kubernetes, the open-source platform for managing application containers across clusters of hosts, was this hint system. Think of it as Amazon's recommended products, but instead of recommending products the company recommends records, stories, or familial connections.
"It was a very important part of the site," says Ancestry software engineer and architect Paul MacKay, "but also small enough for a pilot project that we knew we could handle in a very appropriate, secure way." And when it went live smoothly in early 2016, "our deployment time for this service was cut down from 50 minutes to 2 or 5 minutes," MacKay adds.
"The development team was just thrilled because we're focused on supplying a great experience for our customers. And that means features, it means stability, it means all those things that we need for a first-in-class type operation."
Your Tasks:
Write a 2000-word report discussing the organisation’s cloud computing architecture and how Kubernetes can help the organisation solve its current cloud app deployment problems. Specifically, your report should contain the following elements:
1. An Executive Summary of the report (200 words)
2. A brief Introduction about the organisation (100 words)
a. Introduce the organisation
b. Brief discussion of the problem(s) they are facing
c. Overview of the recommended solution
d. Discussion about the organisation’s cloud computing architecture (1000 words) – you can make reasonable assumptions
(i) Platform (if with cloud provider, e.g., Amazon Web Services (AWS), Microsoft Azure, Google Cloud)
(ii) Networking components (e.g., virtual networks, subnets, routers, and load balancers)
(iii) Storage and databases (e.g., object storage, file storage, relational databases, and NoSQL databases)
(iv) Application and services (e.g., web applications, APIs, microservices, or serverless functions)
(v) Security and compliance measures (e.g., firewalls, encryption mechanisms, identity and access management (IAM), and compliance frameworks)
e. Discussion about Kubernetes and how it will help to manage and deploy applications in the cloud (200 words)
f. Discussion about the technological, legal, and ethical issues related to handling of sensitive data in the cloud, including emerging trends such as serverless cloud and green cloud computing (400 words)
g. Conclusion (100 words)
h. References (minimum of 5 academic references)
Solution
1. Introduction to the Organization
a. Introduction to Ancestry
Ancestry is the world’s largest family history company and a leading consumer genomics company. Millions of people use Ancestry each month to discover their heritage. Ancestry’s family of brands has over 2.6 million subscribers, 90 million family trees, and the world’s largest consumer DNA network, all supported by technology that helps customers discover their family stories.
b. Problem Statement
Ancestry’s legacy systems are slow and cumbersome to deploy to. Its traditional infrastructure and deployment methodology limit the scalability and agility of the business, which in turn degrades the customer experience.
c. Recommended Solution
To improve efficiency, Ancestry chose a cloud-native infrastructure built on Docker, Kubernetes, and Prometheus. This infrastructure scales automatically, deploys faster, and automates more of the release process while keeping security and compliance intact.
d. Cloud Computing Architecture
(i) Cloud Platform
Ancestry has migrated to a cloud-native infrastructure that uses Amazon Web Services (AWS) as its cloud provider. The main reasons are that AWS is scalable, reliable, and secure, and that it offers many managed services that support modern application architectures. The platform uses EC2 instances for compute power and AWS Lambda for serverless computing.
Figure 1: Amazon AWS
(Source: Abu-Jassar et al., 2024)
Containerized applications run on Amazon Elastic Kubernetes Service (EKS) (Alhaidari et al., 2023). Network isolation is provided by separate segments within Amazon Virtual Private Cloud (VPC), and access controls are enforced by AWS Identity and Access Management (IAM).
In addition, Amazon S3 provides scalable storage for billions of historical records and DNA data, and Amazon RDS supports structured genealogy databases. The CloudFront CDN accelerates the user experience globally. As a result, deployment time has been cut substantially, and responsiveness and fault tolerance have improved. AWS protects Ancestry’s infrastructure with built-in security features such as encryption, with services that support compliance with GDPR and HIPAA, and with auditing and monitoring tools such as CloudTrail and CloudWatch (Alghofaili et al., 2021). By modernizing its IT infrastructure on the AWS ecosystem, Ancestry can process data faster, analyze data in real time, and scale to millions of users.
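As a minimal sketch of the access-control idea in this section, the following Python builds a least-privilege IAM policy document as plain JSON. The bucket name `example-ancestry-records` and the `records/` prefix are hypothetical placeholders, not names taken from Ancestry's environment. Only the document layout (Version, Statement, Effect, Action, Resource) follows the standard IAM JSON policy format.

```python
import json


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy that allows read-only access to one S3 prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadRecordsOnly",
                "Effect": "Allow",
                # Only object reads are granted; no write or delete actions.
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


# Hypothetical bucket and prefix, used only for illustration.
policy = read_only_policy("example-ancestry-records", "records/")
print(json.dumps(policy, indent=2))
```

Granting a single action on a single prefix keeps the blast radius small if the credentials attached to this policy are ever compromised.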
(ii) Networking Components
A well-architected networking framework keeps Ancestry’s cloud infrastructure on Amazon Web Services (AWS) secure and scalable and ensures efficient data flow. The networking setup is based on Amazon Virtual Private Cloud (Amazon VPC), a logically isolated environment for running applications and databases.
Subnets within the VPC segment workloads across multiple Availability Zones, which improves fault tolerance. Internet-facing services such as web applications are placed in public subnets, while non-internet-facing services such as databases, backend services, and sensitive data handling are placed in private subnets for more secure hosting (Bonati et al., 2021).
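The subnet layout described above can be sketched with Python's standard `ipaddress` module. The VPC range `10.0.0.0/16`, the /24 subnet size, and the zone labels `az-a` and `az-b` are assumptions chosen for illustration; the report does not state Ancestry's actual address plan.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones, new_prefix=24):
    """Carve one subnet per tier and per zone out of a VPC address range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of consecutive blocks
    plan = []
    for tier in ("public", "private"):
        for zone in zones:
            plan.append({"tier": tier, "zone": zone, "cidr": str(next(blocks))})
    return plan


# Hypothetical VPC range and zone labels.
plan = plan_subnets("10.0.0.0/16", ["az-a", "az-b"])
for subnet in plan:
    print(subnet)
```

Placing one subnet of each tier in every zone means the loss of a single Availability Zone leaves both the web tier and the data tier reachable in the other.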
Amazon Route 53 acts as a highly available Domain Name System (DNS) service, routing traffic to the different services within Ancestry's infrastructure. Internet gateways and NAT gateways let administrators control internet access for each subnet type while keeping security policies in force.
Figure 2: Elastic Load Balancer
(Source: Hamdan et al., 2021)
Ancestry uses Elastic Load Balancing (ELB) because it automatically distributes incoming traffic across healthy servers, which provides high availability and fault tolerance and helps prevent downtime (Haji et al., 2021). Application Load Balancers (ALB) handle HTTP/HTTPS traffic, and Network Load Balancers (NLB) serve high-performance, low-latency applications.
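To illustrate how a load balancer spreads requests while skipping failed servers, here is a small round-robin sketch in Python. It is a simplified teaching model, not the ELB implementation, and the target names `web-1` to `web-3` are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate through targets in order, skipping any marked unhealthy."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)
        self._ring = cycle(self.targets)

    def mark_unhealthy(self, target):
        self.healthy.discard(target)

    def mark_healthy(self, target):
        if target in self.targets:
            self.healthy.add(target)

    def next_target(self):
        # Look at each target at most once per call.
        for _ in range(len(self.targets)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy targets available")


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
lb.mark_unhealthy("web-2")  # simulate a failed health check
picks = [lb.next_target() for _ in range(4)]
print(picks)
```

With `web-2` marked unhealthy, requests alternate between the two remaining targets, which is the behaviour that keeps a service available when one instance fails.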
AWS security groups and network access control lists (NACLs) control inbound and outbound traffic. With AWS Transit Gateway, administrators can manage connectivity between many VPCs from one place, which simplifies communication between services.
(iii) Storage and Databases
Ancestry’s cloud infrastructure stores genealogical and genomic data, which calls for a combination of object storage, file storage, relational databases, and NoSQL databases.
For object storage, Amazon S3 (Simple Storage Service) provides Ancestry with scalable, durable, and cost-effective storage for historical records, DNA data, media files, and user-generated content (Rao, 2021). S3 versioning and lifecycle policies enable efficient data management and support data retention and compliance with business and legal requirements.
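The S3 lifecycle policies mentioned above can be sketched as a configuration document. The Python below builds one rule in the JSON shape used by S3 lifecycle configurations; the `scans/` prefix and the 90-day and 365-day thresholds are hypothetical values, not Ancestry's actual retention settings.

```python
import json


def lifecycle_rules(prefix, archive_after_days, expire_noncurrent_after_days):
    """Build an S3 lifecycle configuration with a single rule."""
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Move current objects to colder storage after a set age.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # With versioning on, delete old versions after a set age.
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": expire_noncurrent_after_days
                },
            }
        ]
    }


# Hypothetical prefix and retention thresholds.
config = lifecycle_rules("scans/", 90, 365)
print(json.dumps(config, indent=2))
```

Pairing a transition with a noncurrent-version expiry lowers storage cost for rarely read records while still keeping a bounded history of overwritten objects.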
For file storage, Ancestry employs Amazon EFS (Elastic File System), which scales elastically and gives multiple instances shared access to application files. This works particularly well for distributed workloads such as data processing pipelines and machine learning models that analyze DNA sequences.
Figure 3: PostgreSQL and MySQL
(Source: Klimek and Skublewska-Paszkowska, 2021)
Ancestry uses the PostgreSQL and MySQL engines on Amazon RDS to store structured genealogy data, user profiles, and transaction records in relational databases. Amazon Aurora, a high-performance relational database, is also used for intensive query processing (Klimek and Skublewska-Paszkowska, 2021). For NoSQL workloads, Amazon DynamoDB handles high-velocity, low-latency data operations related to user activity, real-time search indexing, and recommendation features such as Ancestry’s Shaky Leaf hint system. Amazon Redshift provides data warehousing for integrated analytics and complex queries across Ancestry’s datasets.
(iv) Applications and Services
Table 1: Application and Services
Table 1 lists the key applications and services in Ancestry’s cloud infrastructure, together with its security and compliance measures. The web application exposes the primary user interface and interacts with the APIs and microservices behind it (Lamothe et al., 2021). AWS Lambda runs background processing without servers to maintain, which reduces server costs. IAM, firewalls, and encryption protocols protect data. Ancestry complies with frameworks such as GDPR and HIPAA to safeguard user privacy. By integrating these technologies, Ancestry can provide its customers with a secure, scalable, and efficient platform.
e. Kubernetes and Its Benefits
Ancestry’s journey to cloud-native infrastructure is still underway, and it relies on Kubernetes to simplify the deployment, management, and scaling of the applications that serve its customers. As an open-source container orchestration platform, Kubernetes helps Ancestry deploy services across multiple cloud environments while maintaining high availability and resilience (Shamim et al., 2022). Its biggest advantage is automated scaling. This matters because the services that assemble archives, gather data, and query databases handle large amounts of genealogical and DNA data, so capacity must follow demand.
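The automated scaling described above can be made concrete with the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up, with no change when the ratio is within a tolerance (0.1 by default). The sketch below is a simplified model of that formula only; the replica counts and CPU figures are made-up inputs.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified Horizontal Pod Autoscaler replica calculation."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical inputs: average CPU utilisation in percent against a 60% target.
print(desired_replicas(4, 90, 60))  # load above target, scale out
print(desired_replicas(4, 63, 60))  # within tolerance, no change
print(desired_replicas(6, 30, 60))  # load below target, scale in
```

The real autoscaler also accounts for pod readiness, missing metrics, and stabilisation windows, which this sketch leaves out.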
Efficient use of resources is another advantage. Because Ancestry deploys applications in containers managed by Kubernetes, server utilization is optimized, which lowers infrastructure cost while maintaining performance. Rolling updates and canary deployments let new features and updates roll out smoothly without disrupting the user experience (Nocentino et al., 2021). Kubernetes also makes it easier to run workloads across private and public clouds, which suits multi-cloud and hybrid cloud deployments at Ancestry.
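A rolling update is configured on the Deployment object itself. As a rough sketch, the Python below builds an `apps/v1` Deployment manifest whose strategy adds one new pod at a time and never removes a serving pod before its replacement exists. The service name `hints-service`, the image reference, and the replica count are hypothetical; the manifest can be serialised to JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def deployment_manifest(name, image, replicas):
    """Build a Kubernetes Deployment manifest with a rolling update strategy."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # One extra pod may start; no serving pod is taken down early.
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical service name and image reference.
manifest = deployment_manifest("hints-service", "registry.example.com/hints:1.2.0", 3)
print(json.dumps(manifest, indent=2))
```

Setting `maxUnavailable` to 0 trades a slightly slower rollout for uninterrupted capacity, which suits a customer-facing service.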
f. Technological, Legal, and Ethical Issues
Table 2: Challenges and Emerging Trends in Cloud-Based Data Handling
Ancestry handles large amounts of sensitive personal and genetic data, and handling that data in the cloud raises technological, legal, and ethical issues. On the technological side, data security is the main issue. Strong encryption in transit and at rest is needed to protect DNA records and family histories. The familiar risks remain: data breaches, insider threats, and misconfigurations (Banerjee, 2023). Data sovereignty is a related concern, because cloud services span many countries and must comply with the laws of each jurisdiction. Vendor lock-in is another, since migrating away from a provider such as AWS can be complex and costly.
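The data sovereignty concern can be illustrated with a simple placement check that refuses to store a user's data outside the regions permitted for that user's jurisdiction. This is a sketch under stated assumptions: the jurisdiction-to-region mapping below is a hypothetical policy invented for the example, and a real mapping would come from legal review.

```python
# Hypothetical residency policy: jurisdiction -> cloud regions allowed for storage.
ALLOWED_REGIONS = {
    "EU": {"eu-west-1", "eu-central-1"},
    "US": {"us-east-1", "us-west-2"},
}


def is_placement_allowed(user_jurisdiction, storage_region):
    """Return True only if the region is permitted for the jurisdiction."""
    # Unknown jurisdictions get an empty set, so they are denied by default.
    return storage_region in ALLOWED_REGIONS.get(user_jurisdiction, set())


print(is_placement_allowed("EU", "eu-west-1"))
print(is_placement_allowed("EU", "us-east-1"))
```

Denying unknown jurisdictions by default is the safer choice here, because a missing policy entry then blocks the write instead of silently allowing it.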
From a legal standpoint, Ancestry has to comply with GDPR and HIPAA, which govern how personal data is collected, stored, and processed with user consent. Data ownership is also an issue, as users must retain control over their genetic information (Hassan, 2025). The ethical issues are privacy and the risk of data misuse: a leak or unauthorized access could lead to genetic discrimination or identity theft.
g. Conclusion
Ancestry’s transition to cloud-native infrastructure has greatly improved scalability, security, and efficiency. Its architecture, built around AWS services, is flexible and robust and supports seamless data processing for millions of users. Security features such as IAM and encryption protect sensitive user data, and Kubernetes automates deployment. GDPR and HIPAA compliance addresses legal and privacy concerns and ensures that data is handled according to ethical standards. Although challenges such as data sovereignty and data misuse remain, Ancestry is well placed to take advantage of emerging trends such as serverless and green cloud computing.
References
Abu-Jassar, A.T., Attar, H., Amer, A., Lyashenko, V., Yevsieiev, V. and Solyman, A., 2024. Remote Monitoring System of Patient Status in Social IoT Environments Using Amazon Web Services (AWS) Technologies and Smart Health Care. International Journal of Crowd Science, 8. https://www.researchgate.net/profile/Hani-Attar/publication/384687664_Remote_Monitoring_System_of_Patient_Status_in_Social_IoT_Environments_Using_Amazon_Web_Services_AWS_Technologies_and_Smart_Health_Care/links/67653ebe117f340ec3cf7074/Remote-Monitoring-System-of-Patient-Status-in-Social-IoT-Environments-Using-Amazon-Web-Services-AWS-Technologies-and-Smart-Health-Care.pdf
Alghofaili, Y., Albattah, A., Alrajeh, N., Rassam, M.A. and Al-Rimy, B.A.S., 2021. Secure cloud infrastructure: A survey on issues, current solutions, and open challenges. Applied Sciences, 11(19), p.9005. https://www.mdpi.com/2076-3417/11/19/9005
Alhaidari, F., Rahman, A. and Zagrouba, R., 2023. Cloud of Things: architecture, applications and challenges. Journal of Ambient Intelligence and Humanized Computing, 14(5), pp.5957-5975. https://link.springer.com/article/10.1007/s12652-020-02448-3
Banerjee, S., 2023. Challenges and Solutions for Data Management in Cloud-Based Environments. International Journal of Advanced Research in Science, Communication and Technology, pp.370-378. https://hal.science/hal-04901406/
Bonati, L., D'Oro, S., Polese, M., Basagni, S. and Melodia, T., 2021. Intelligence and learning in O-RAN for data-driven NextG cellular networks. IEEE Communications Magazine, 59(10), pp.21-27. https://ieeexplore.ieee.org/abstract/document/9627832/
Haji, S.H., Zeebaree, S.R., Saeed, R.H., Ameen, S.Y., Shukur, H.M., Omar, N., Sadeeq, M.A., Ageed, Z.S., Ibrahim, I.M. and Yasin, H.M., 2021. Comparison of software defined networking with traditional networking. Asian Journal of Research in Computer Science, 9(2), pp.1-18. http://papers.sendtopublish.com/id/eprint/137/
Hamdan, M., Hassan, E., Abdelaziz, A., Elhigazi, A., Mohammed, B., Khan, S., Vasilakos, A.V. and Marsono, M.N., 2021. A comprehensive survey of load balancing techniques in software-defined network. Journal of Network and Computer Applications, 174, p.102856. https://www.sciencedirect.com/science/article/pii/S1084804520303222
Hassan, N.A.B., 2025. Managing Data Dependencies in Cloud-Based Big Data Pipelines: Challenges, Solutions, and Performance Optimization Strategies. Orient Journal of Emerging Paradigms in Artificial Intelligence and Autonomous Systems, 15(2), pp.20-28. https://orientacademies.com/index.php/OJEPAIAS/article/view/2025-02-10
Klimek, B. and Skublewska-Paszkowska, M., 2021. Comparison of the performance of relational databases PostgreSQL and MySQL for desktop application. Journal of Computer Sciences Institute, 18, pp.61-66. https://ph.pollub.pl/index.php/jcsi/article/view/2314
Lamothe, M., Guéhéneuc, Y.G. and Shang, W., 2021. A systematic review of API evolution literature. ACM Computing Surveys (CSUR), 54(8), pp.1-36. https://dl.acm.org/doi/abs/10.1145/3470133
Nocentino, A.E. and Weissman, B., 2021. Kubernetes architecture. SQL Server on Kubernetes: Designing and Building a Modern Data Platform, pp.53-70. https://link.springer.com/chapter/10.1007/978-1-4842-7192-6_3
Rao, M.V., 2021. Data duplication using Amazon Web Services cloud storage. In Data Deduplication Approaches (pp. 319-334). Academic Press. https://www.sciencedirect.com/science/article/pii/B9780128233955000069
Shamim, S.I., Gibson, J.A., Morrison, P. and Rahman, A., 2022. Benefits, Challenges, and Research Topics: A Multi-vocal Literature Review of Kubernetes. arXiv preprint arXiv:2211.07032. https://arxiv.org/abs/2211.07032




