Practical Synthetic Data Generation: Balancing Privacy and the Broad Availability of Data

$149.45 AUD
SKU: 9781492072744
Product Type: Books
Author: Khaled El Emam
Format: Paperback
Language: English

Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data--fake data generated from real data--so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue.

Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution.

This book describes:

  • Steps for generating synthetic data using multivariate normal distributions
  • Methods for distribution fitting covering different goodness-of-fit metrics
  • How to replicate the simple structure of original data
  • An approach for modeling data structure to consider complex relationships
  • Multiple approaches and metrics you can use to assess data utility
  • How analysis performed on real data can be replicated with synthetic data
  • Privacy implications of synthetic data and methods to assess identity disclosure


Authors: Khaled El Emam, Lucy Mosquera, Richard Hoptroff
Publisher: O'Reilly Media
Published: 06/09/2020
Pages: 166
Binding Type: Paperback
Weight: 0.60 lbs
Size: 9.19h x 7.00w x 0.35d
ISBN: 9781492072744

About the Author

Dr. Khaled El Emam is a senior scientist at the Children's Hospital of Eastern Ontario (CHEO) Research Institute and Director of the multi-disciplinary Electronic Health Information Laboratory, conducting academic research on synthetic data generation methods and re-identification risk measurement. He is also a Professor in the Faculty of Medicine (Pediatrics) at the University of Ottawa.

He is the founder, CEO, and President of Privacy Analytics. Khaled has been performing data analysis since the early 1990s, building statistical and machine learning models for prediction and evaluation. Since 2004 he has been developing technologies to facilitate the sharing of data for secondary analysis, from basic research on algorithms to applied solutions that have been deployed globally. These technologies addressed problems in anonymization and pseudonymization, synthetic data, secure computation, and data watermarking. He has (co-)written multiple books on various privacy and software engineering topics. In 2003 and 2004, he was ranked as the top systems and software engineering scholar worldwide by the Journal of Systems and Software based on his research on measurement and quality evaluation and improvement. Previously, Khaled was a Senior Research Officer at the National Research Council of Canada. He also served as the head of the Quantitative Methods Group at the Fraunhofer Institute in Kaiserslautern, Germany. He held the Canada Research Chair in Electronic Health Information at the University of Ottawa from 2005 to 2015, and has a PhD from the Department of Electrical and Electronics Engineering, King's College, at the University of London, England.

Lucy Mosquera has a bachelor's degree in Biology and Mathematics from Queen's University and is a current graduate student in the Department of Statistics at the University of British Columbia. During her time at Queen's, Lucy provided data management support on a dozen clinical trials and observational studies run through Kingston General Hospital's Clinical Evaluation Research Unit. Lucy has also worked on clinical trial data sharing methods based on homomorphic encryption and secret sharing protocols. At Replica Analytics, Lucy is responsible for developing statistical and machine learning models for data generation, integrating subject area expertise in clinical trial data into synthetic data generation methods, and performing statistical assessments of its synthetic data generation.

Dr. Richard Hoptroff is a long-term technology inventor, investor, and entrepreneur. He was awarded a PhD in Physics by King's College London for his work in optical computing and artificial intelligence. In 1992, together with Ravensbeck, he founded Right Information Systems, a neural network forecasting software company that was sold to Cognos Inc (part of IBM) in 1997. He then worked as a postdoc at the Research Laboratory for Archaeology and the History of Art at Oxford University and, in 2001, created Flexipanel Ltd, a company supplying Bluetooth modules to the electronics industry.

In 2010, he founded Hoptroff London with the aim of developing smart, hyper-accurate watch movements and creating a new watch brand. In 2013 he established a new commercial category when he brought to market the first commercial atomic timepiece and atomic wristwatch.

Hoptroff has now leveraged his expertise in timing technology and software to develop a hyper-accurate synchronised timestamping solution for the financial services sector, based on a unique combination of grandmaster atomic clock engineering and proprietary software.


