Big data is the backbone of modern business, but before it can be used, it must be properly managed. Here is an overview of the ins and outs of data management.
Every company in the world has to deal with data. From single-person LLCs to multinational corporations, data is everywhere and must be well managed to be an effective business tool.
However, data is not just customer records and other external information – employee records, network maps, salary data, and other forms of internal information are all included in the list of data to be managed.
It takes a lot of work to turn data into something useful. Without good management you can end up with duplicate records, incorrect information, wasted time and storage space, and a host of other problems caused by poor organization. Digital data is much more complicated than paper, so organizing it requires specialized skills.
Enter the world of data management. Here is the most important information about data management, including models, software, implementation and more.
What is data management?
There are as many ways to define data management as there are websites that are focused on it. DAMA International, a consortium of data management professionals, defines data management as “the development and implementation of architectures, policies, practices and procedures to effectively manage the information needs of an enterprise.”
In other words, data management is multidisciplinary and keeps data organized in a practical, usable way. At the most fundamental level, data management works to ensure that an organization's entire body of data is accurate, consistent, easily accessible, and well-protected.
Data management not only eliminates duplicates and standardizes formats, but also lays the foundation for data analysis. Without good data management, analysis is unreliable at best and practically impossible at worst.
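As a minimal sketch of what eliminating duplicates and standardizing formats can look like in practice, the Python example below cleans a small list of customer records. The field names (name, email), the sample records and the choice of email address as the duplicate key are assumptions made for illustration, not part of any particular data management platform.

```python
def normalize_record(record):
    """Return a copy of the record with whitespace trimmed and casing standardized."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def deduplicate(records):
    """Normalize records and keep the first occurrence of each email address."""
    seen = set()
    unique = []
    for record in records:
        clean = normalize_record(record)
        if clean["email"] not in seen:
            seen.add(clean["email"])
            unique.append(clean)
    return unique


# Hypothetical sample data: the first two entries describe the same customer.
raw_records = [
    {"name": "ada  lovelace", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "grace hopper", "email": "grace@example.com"},
]

clean_records = deduplicate(raw_records)
```

Normalizing before comparing matters here: the two records for the same customer differ only in spacing and capitalization, so a comparison of the raw values would miss the duplicate.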
What does a complete data management model entail?
If the data management definitions and descriptions make your head spin a little, it’s understandable – there’s a lot that goes into the practice of data management.
DAMA International splits data management into 11 knowledge areas:
Data governance, that is, the planning of all aspects of data management. This typically includes ensuring availability, usability, consistency, integrity, and security of data managed by an organization.
Data architecture, or the general structure of an organization’s data and how it fits into a broader business architecture.
Data modeling and design, which includes data analysis and the design, construction, testing and maintenance of analysis systems.
Data storage and operations, which deals with the physical hardware used to store and manage data.
Data security, which includes all elements of data protection and ensures that only authorized users have access.
Data integration and interoperability, which includes everything that has to do with the transformation of data into a structured form (i.e. in an organized database) and the work required to maintain it.
Documents and content, which includes all forms of unstructured data and the work needed to make it accessible to and integrated with structured databases.
Reference and master data, or the process of managing data so that redundancy and other errors are reduced by standardizing data values.
Data warehousing and business intelligence, where data is managed and applied for analysis and business decision making.
Metadata, which includes all elements of creating, collecting, organizing and managing metadata (data referring to other data, such as headers, etc.).
Data quality, where data and data sources are monitored to ensure that quality information is provided, integrity is maintained, and poor-quality data is filtered out.
All of these elements must be included in a complete data management model; if even one is missing, some aspect of data management becomes more difficult, if not broken outright. For example, without metadata management, you lose the ability to easily categorize data. Without guaranteed data quality, the entire data structure is suspect and analyses become useless. Eliminating integration and interoperability would make it almost impossible to combine different forms of data into a usable whole.
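To make two of the knowledge areas above more concrete, the following Python sketch combines a small piece of reference data (standardizing country values) with a basic data quality check that filters out poor-quality records. The COUNTRY_CODES table, the customer_id and country fields, and the acceptance rule are hypothetical choices for illustration; a real model would define its own reference data and quality rules.

```python
# Hypothetical reference data: maps free-form country values to one standard code.
COUNTRY_CODES = {
    "us": "US",
    "usa": "US",
    "united states": "US",
    "uk": "GB",
    "united kingdom": "GB",
}


def standardize_country(value):
    """Return the standard code for a country value, or None if it is unknown."""
    return COUNTRY_CODES.get(value.strip().lower())


def split_by_quality(records):
    """Separate records into accepted (complete, standardized) and rejected lists."""
    accepted, rejected = [], []
    for record in records:
        code = standardize_country(record.get("country", ""))
        if record.get("customer_id") and code:
            accepted.append({**record, "country": code})
        else:
            rejected.append(record)
    return accepted, rejected


# Hypothetical sample data with one missing ID and one unknown country.
records = [
    {"customer_id": "C001", "country": "USA"},
    {"customer_id": "C002", "country": " united kingdom "},
    {"customer_id": "", "country": "US"},
    {"customer_id": "C004", "country": "Atlantis"},
]

accepted, rejected = split_by_quality(records)
```

Keeping the rejected records in their own list, rather than silently dropping them, lets a team review why data failed the check and fix it at the source.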
How does data management fit into a larger big data model?
If an analysis model is the product made from the data of a company, then data management is the factory, the materials, the supply chain – all that is needed to make the product.
You can’t have a big data model without data management – trying to do so is like claiming that your messy desk is perfectly organized chaos in which you can find everything; over time you will certainly lose something important.
Data management is a full lifecycle system that tracks data from the moment it is created until it is no longer usable. Data management tracks data from place to place, monitors the transition of data from one form to another and ensures that nothing important is omitted from a business analysis model.
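One way to picture this lifecycle tracking is as a running log of what happens to each dataset. The Python sketch below is an assumption-laden illustration, not a description of how any specific product works: the TrackedDataset and LifecycleEvent classes, the action names and the sample dataset are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LifecycleEvent:
    action: str          # e.g. "created", "moved", "transformed", "retired"
    detail: str          # free-text description of what happened
    timestamp: datetime  # when the event was recorded (UTC)


@dataclass
class TrackedDataset:
    name: str
    events: list = field(default_factory=list)

    def record(self, action, detail):
        """Append a timestamped event to this dataset's history."""
        self.events.append(LifecycleEvent(action, detail, datetime.now(timezone.utc)))

    def history(self):
        """Return the sequence of actions in the order they were recorded."""
        return [event.action for event in self.events]

    def is_retired(self):
        """Report whether the most recent event marks the data as no longer usable."""
        return bool(self.events) and self.events[-1].action == "retired"


# Hypothetical dataset followed from creation to retirement.
dataset = TrackedDataset("customer_orders")
dataset.record("created", "exported from the order system as CSV")
dataset.record("moved", "copied to the analytics storage area")
dataset.record("transformed", "converted from CSV to database rows")
dataset.record("retired", "retention period expired")
```

With a history like this, a team can answer where a dataset came from, which forms it has passed through and whether it should still feed an analysis model.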
In short, data management does not just fit into a big data model – it is the umbrella under which all big data falls.
What skills do data management professionals need?
The essential role that data plays in the modern business world is unmistakable. Big data professionals need specific skills that make good data management possible.
A data management team needs a mix of people, each proficient in particular parts of the end-to-end management chain. The skills that a data management professional needs to be trained in include:
General computer science: A qualified data management professional must be trained in the basics of computer science – they will spend a lot of time using basic skills to organize data.
Database programming: Some of the most important languages and tools for data management are SQL, Python, R, Hadoop, XML and Perl. Make sure you learn at least one of them and become familiar with the corresponding database platforms.
BI / BA: Business intelligence (BI) and business analytics (BA) are the core reason why companies collect and organize data. Data management professionals must be able to understand the how and why of analysis.
Cloud computing: Data hosting can take up a lot of storage space. That’s why many companies are turning to the cloud to host, manage, and analyze their data. Skilled data management experts should be familiar with AWS, Microsoft Azure, Google Cloud, IBM Cloud, and other important platforms.
Machine learning: Data analysis, in particular the later stages such as predictive analysis and prescriptive analysis, makes extensive use of machine learning technology to reduce the computing time needed to deliver results.
Data management certifications: Data management is a science in itself and there are various certificates that professionals in the field of data management can pursue. DAMA International offers the Certified Data Management Professional (CDMP) certification. Oracle, IBM and others also offer certifications.
Soft skills: The use of data requires a great deal of cooperation with non-IT departments to plan and implement big data strategies. Good writing, speaking and innovative thinking are essential skills for successful data management professionals.
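As a small taste of the database programming skill listed above, here is a hedged sketch that uses SQL from Python through the standard library's sqlite3 module to find duplicate records. The customers table, its columns and the sample rows are invented for illustration, and an in-memory SQLite database stands in for whatever database platform an organization actually uses.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
)

# Hypothetical sample rows: one email address appears twice.
connection.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("ada@example.com",), ("grace@example.com",), ("ada@example.com",)],
)

# Group rows by email and keep only the groups with more than one row.
duplicates = connection.execute(
    "SELECT email, COUNT(*) FROM customers GROUP BY email HAVING COUNT(*) > 1"
).fetchall()

connection.close()
```

The same GROUP BY and HAVING pattern carries over to most SQL databases, which is one reason SQL is a practical first language for data management work.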
Which data management software is available?
Data management cannot be done haphazardly – organizations must invest in a data management platform that can deliver all the results they need to manage and use data.
There are a number of data management platforms, each with its own features and the industries it suits best. Some of the best platforms are:
Some platforms, such as the big data analysis software from Google Cloud, are not specifically built to do data management, but that doesn’t mean they can’t do it. In the case of Google Cloud, all necessary software is present, but it must be configured to function as a data management platform.
As with any large software platform, choosing the right platform can make a huge difference in the success of an organization right from the start. When making a decision about a platform, make sure your data management team has a good understanding of the type of data you have, how you want to host it, and what your data management goals are. Armed with that information, a data management team can make the best possible choice for the needs of their organization.
How can organizations get started with data management?
Planning a data management initiative may seem to involve a million and one pieces, but don’t get stuck in the weeds: planning to integrate data management into your organization is just like any other business transformation project.
First, make sure that your data management initiative has a clear goal: for what purpose are you trying to organize your data? For example, a company that wants to use data to make internal changes has different data management needs than a company that wants to use its data to increase sales.
Once you have a specific goal, it’s time to think about what will be needed to make it happen. If your data exists entirely as unstructured files and documents, you will have a different starting point than an organization with large Hadoop databases filled with well-organized records.
Consider all possible needs: staff reassignments, new hires, training, software platforms, budget, timetable, the types of data that are already available, the types of data that are needed, and more. Keeping all these elements in mind will help when you start planning in earnest.
Then it’s time to deploy your talent. Hire new employees, reassign people who will be working on your data management project and familiarize the team with your data management goals.
Once your data management team is in place, it’s time to start the planning phase. Beyond working out how the team is going to achieve its goals, this is when a data management platform is chosen, training takes place, and the entire model begins to come together.
After this, your data management team should be well on the way to building, testing and implementing a complete data management model. Once all these requirements are met and data management is an integrated part of your business, it’s time to think about what comes next: how all that well-organized data can help transform your organization, internally and externally.
The entire process of building a data management system can take a long time, and even then, data management is only the basis for further use of big data.
Data management is not a goal in itself: it is the house in which the data of an organization lives. It is up to that organization to use the house that it has built by putting that data to work.