Planet Big Data is an aggregator of blogs about big data, Hadoop, and related topics. We include posts by bloggers worldwide. Email us to have your blog included.

 

March 23, 2019

Cloud Avenue Hadoop Tips

Tips and Tricks for using K8S faster and easier

In this post I will list tips and tricks that make K8S easier and faster to use. Some of them are also extremely useful during the certification exams. I usually add them to my .bashrc file.

I will keep updating this post as I come across more. Also, if you know of any additional tips, let me know in the comments and I will add them here.

1) Create a shortcut for kubectl; this saves a few keystrokes.
alias k='kubectl'
2) Adding this to .bashrc makes auto-completion work with the alias 'k' mentioned above, so commands can be completed faster.
source <(kubectl completion bash | sed s/kubectl/k/g)
3) This deletes the namespace and creates it again, which makes sure that all the objects in the namespace are cleaned up. It keeps K8S fast, as objects don't pile up over time.
alias kc='k delete namespaces my-namespace;k create namespace my-namespace' 
Once this is done, the default namespace has to be changed. Remember to replace the context name (kubernetes-admin@kubernetes) in the second command with the output of the first command.
kubectl config current-context
kubectl config set-context kubernetes-admin@kubernetes --namespace=my-namespace
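The two steps in tip 3 can also be folded into one helper, so that the context name never has to be copied by hand. This is only a sketch: the function name 'kreset' and the optional namespace argument are my own assumptions, not something kubectl provides.

```shell
# kreset [NAMESPACE]: delete and recreate a namespace, then make it the
# default namespace of whatever context is currently active.
# The name 'kreset' and the default 'my-namespace' are illustrative.
kreset() (
  ns="${1:-my-namespace}"
  # --ignore-not-found avoids an error on the very first run,
  # when the namespace does not exist yet.
  kubectl delete namespace "$ns" --ignore-not-found
  kubectl create namespace "$ns"
  # Look up the context name instead of hard-coding it.
  ctx="$(kubectl config current-context)"
  kubectl config set-context "$ctx" --namespace="$ns"
)

# Usage (not run here):
#   kreset              # resets my-namespace
#   kreset scratch      # resets a namespace called scratch
```

The function body runs in a subshell, so the helper variables do not leak into the interactive shell.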
4) The watch command is used a lot in K8S to observe the different objects getting created and destroyed (the object life cycle). When we run 'watch k get pods', the 'k' alias doesn't get expanded unless the line below is included in .bashrc. The trailing space in the alias value tells bash to also check the next word for alias expansion.
alias watch='watch '
5) Many more tips are listed here at 'Pimp my Kubernetes Shell'.

6) YAML is less verbose than XML, but typing YAML is still a pain. The commands below create sample YAML files that can be modified later as required. Note that the actual K8S objects are not created, only the YAML files, because the --dry-run option is used.
k run nginx --image=nginx --restart=Never --dry-run -o yaml > pod.yaml
k run nginx --image=nginx -r=2 --generator=run/v1 --dry-run -o yaml > rc.yaml
k run nginx --image=nginx -r=2 --dry-run -o yaml > deployment.yaml
k expose deployment nginx --target-port=80 --port=8080 --type=NodePort --dry-run -o yaml > service.yaml
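On newer kubectl releases (1.18 and later, as far as I know) the dry-run flag takes an explicit value and the run generators are gone, so the commands above may need adjusting. A small wrapper, sketched here under that assumption, keeps the typing short. The name 'kyaml' is made up for this example.

```shell
# kyaml ARGS...: print the YAML a kubectl command would create,
# without creating anything on the cluster.
# Assumes a kubectl version that accepts --dry-run=client.
kyaml() {
  kubectl "$@" --dry-run=client -o yaml
}

# Usage (not run here):
#   kyaml run nginx --image=nginx --restart=Never > pod.yaml
#   kyaml create deployment nginx --image=nginx   > deployment.yaml
```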
That's it for now. Have fun with K8S.
 

March 22, 2019


Forrester Blogs

Make No Mistake — Microsoft Is A Security Company Now

Microsoft has announced support for macOS in its rebranded Microsoft Defender ATP product, taking this product from being an offering that could be considered an add-on for hardening its own...

...

Forrester Blogs

Thoughts On DX: Headless Vs. AI/ML

I have been noodling on the future of digital experience (DX) for the better part of six years. Today I want to run a hypothesis past you: Headless architecture and AI/machine-learning (ML)...

...
Cloud Avenue Hadoop Tips

Exposing a K8S Service of Type LoadBalancer on Local Machine

As we have been exploring in this blog, there are different ways of installing and using K8S. It can be in the Cloud, on-premises or on your Laptop. For the sake of trying different things I prefer installing it on the Laptop, as I have complete control over it (building and breaking), I can work offline, and there is no cost associated with it. But there are a few disadvantages, like not being able to integrate with the different Cloud services. We will look at one such disadvantage of installing K8S on the Laptop and a workaround for it.

When we create Pods, an IP address is assigned to each of them. When a Pod goes down and the Deployment automatically creates a new one, the new Pod might be assigned a different IP address. So, when the IP address is not static, how do we access the Pods? This is where a K8S Service comes into play. More about the K8S Service and the different ways it can be exposed here, here, here and here.

Depending on the kind of access required, a Service can be exposed with Type set to ClusterIP, NodePort or LoadBalancer as mentioned here. When a Service of Type LoadBalancer is created, K8S automatically creates a Cloud vendor specific Load Balancer. This is good in the Cloud, but what happens when we create a Service of Type LoadBalancer in a non-Cloud environment such as the Laptop, where a Load Balancer can't be provisioned automatically? This can be addressed using MetalLB, which is described as a 'bare-metal load-balancer for K8S'.

So, let's go through the steps to work around this problem using MetalLB.

Step 1: Create a file nginxlb.yaml with the below content. This basically creates a Deployment and a Service for nginx.
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1
        ports:
        - name: http
          containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer
Create another file layer2-config.yaml with the below content. This creates a ConfigMap with the pool of IP addresses that can be assigned to the Load Balancer. Note that the IP address range has to be modified according to the network to which your Laptop is connected.
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: my-ip-space
      protocol: layer2
      addresses:
      - 192.168.0.30-192.168.0.40
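When changing the address range, a quick sanity check is to confirm that the pool sits on the same network as the Laptop. The sketch below assumes a typical home or office LAN with a 255.255.255.0 (/24) netmask, which may not match your setup. The function name is made up for this example.

```shell
# same_subnet24 IP1 IP2: succeed when both IPv4 addresses share their
# first three octets, i.e. they are in the same /24 network.
# Assumption: the LAN uses a 255.255.255.0 netmask.
same_subnet24() {
  # ${1%.*} strips the last dot and everything after it.
  [ "${1%.*}" = "${2%.*}" ]
}

# Usage (not run here), with the Laptop IP and the first address of the pool:
#   same_subnet24 192.168.0.101 192.168.0.30 && echo "pool is on the Laptop's network"
```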
Step 2: Create the Deployment and the Service using the 'kubectl apply -f nginxlb.yaml' command. Notice that the EXTERNAL-IP is shown as <pending>, because in a local setup a Load Balancer cannot be provisioned the way it is in the Cloud.


Step 3: Delete the Deployment and the Service using the 'kubectl delete -f nginxlb.yaml' command, as they will be created again once the MetalLB setup is complete.


Step 4: Install the MetalLB and the related components using 'kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml' command.


Make sure that the corresponding Pods are created using the 'kubectl get pods -n=metallb-system' and also that they are in the Running status after a few minutes.


Step 5: Apply the configuration for the MetalLB installation using the 'kubectl apply -f layer2-config.yaml' command. In this file we specify the range of IP addresses from which the Load Balancer will get its IP address. Don't forget to change the IP addresses in the file based on the network the Laptop is connected to.


Step 6: Now that the MetalLB setup is done, recreate the Deployment and the Service using 'kubectl apply -f nginxlb.yaml'. Notice in the output of the 'kubectl get svc' command that the EXTERNAL-IP is 192.168.0.31 and not <pending> as was the case earlier.
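Instead of re-running 'kubectl get svc' until the address shows up, the external IP can be polled from a script. This is a rough sketch: the function names and the one-second polling interval are my own choices, and the Service name 'nginx' comes from nginxlb.yaml.

```shell
# lb_ip SERVICE: print the external IP assigned to a LoadBalancer Service.
# Prints nothing while the address is still <pending>.
lb_ip() {
  kubectl get svc "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null
}

# wait_for_lb_ip SERVICE [TRIES]: poll once a second until an IP is assigned.
# Prints the IP and succeeds, or fails after TRIES attempts (default 30).
wait_for_lb_ip() (
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    ip="$(lb_ip "$1")"
    if [ -n "$ip" ]; then
      echo "$ip"
      exit 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  exit 1
)

# Usage (not run here):
#   wait_for_lb_ip nginx && echo "Load Balancer is ready"
```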


Step 7: Now that the Load Balancer has been created, open the 192.168.0.31:8080 URL in a browser to get the below nginx page. Remember to change the IP address based on the output of the above step. If we get the below page, we have successfully set up a Service with Type set to LoadBalancer.
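The same check can be done from the terminal without a browser. The helper below is a sketch that pulls the page title out of whatever HTML it is given. The name 'page_title' is made up, and the IP address is the one from Step 6, so adjust it to your own.

```shell
# page_title: read HTML on stdin and print the text inside <title>...</title>.
# Only handles a title that sits on a single line.
page_title() {
  sed -n 's:.*<title>\(.*\)</title>.*:\1:p'
}

# Usage (not run here):
#   curl -s http://192.168.0.31:8080 | page_title
```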


Step 8: Finally, delete the nginx Deployment/Service and the MetalLB related components using the 'kubectl delete -f nginxlb.yaml' and the 'kubectl delete -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml' commands. This is an optional step, just in case you want to clean up.

 

Conclusion


By default, when we create a Service with Type set to LoadBalancer in a non-Cloud environment, the EXTERNAL-IP stays at <pending> as a Load Balancer cannot be created. But in this blog we have seen how to configure a Layer 2 Load Balancer (MetalLB) on the Laptop to get a Load Balancer created. Below are the screenshots from before and after the MetalLB setup. Note the value of the EXTERNAL-IP.

Before

After
 

March 21, 2019


Forrester Blogs

Amazon’s Haven Is An Incubator, Not A Disruptor

Haven, the recently named joint venture of Amazon, Berkshire Hathaway, and JPMorgan Chase, has officially emerged from the shadows with a public persona, giving industry pundits new details about the...

...

Revolution Analytics

AI, Machine Learning and Data Science Roundup: March 2019

A monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from...

...

Forrester Blogs

Getting On The S/4 Bandwagon (Or Not)

Historically, SAP has not been as renowned for innovation as some of its pure SaaS competitors, like Salesforce. But between a steady stream of SaaS (and now data) acquisitions, a notable and ongoing...

...

Forrester Blogs

Backstage Pass: Three Key Takeaways From The 2019 RSA Conference Without Hitting The Expo Floor

Every year, the RSA Conference provides an opportunity to attend a few keynotes and get together with friends, old and new, to discuss trends we’re seeing in the market. While a big theme on the expo...

...

Forrester Blogs

How To Build A Better (Digital) Mousetrap

“Build a better mousetrap, and the world will beat a path to your door.” While Ralph Waldo Emerson may not have said these exact words, the principles remain as true as ever today. Going...

...
Cloud Avenue Hadoop Tips

Installing Istio and Bookinfo Application on K8S

In this blog previously we explored the different ways of getting started with K8S using Play-With-Kubernetes, Minikube and finally easily installing K8S using a combination of VirtualBox and Vagrant.

Now we will install Istio on top of the existing K8S setup and install a Microservices based application called Bookinfo on top of it. Once the setup of Istio is done, we should be able to explore the different features of Istio. The Bookinfo Application is a polyglot application using Python, Java, Ruby and NodeJS. The various Microservices and how they interact are detailed here.

So, what is a service mesh and what is Istio? Istio is an Open Source implementation of a service mesh. While K8S provides orchestration of the Containers, Istio is used for the management of the services created by these Containers in a Microservices based architecture. More about Istio and service mesh here (1, 2, 3). As we explore the different features of Istio in the upcoming blogs, it will become clearer what a service mesh is all about in the context of a Microservices based architecture.

Istio is not the only implementation of service mesh as mentioned here. Google uses Istio, while AWS uses App Mesh. Both of them are built on top of Envoy proxy.

Let's jump into the installation of Istio and the Bookinfo Microservices based application on top of it. We will be following the steps mentioned here and here.

Step 1: Download Istio using the 'curl -L https://git.io/getLatestIstio | ISTIO_VERSION=1.1.0 sh -' command. It will create an 'istio-1.1.0' folder with the below structure.


Step 2: Install the Istio CRD using the `for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done` command.



Step 3: Install the Istio binaries using the 'kubectl apply -f install/kubernetes/istio-demo-auth.yaml' command.



Step 4: Verify the Istio installation using the 'kubectl get svc -n istio-system' and 'kubectl get pods -n istio-system' commands. All the services should be created and the pods should be in a Running or Completed status, as shown below.

Now we are done with the Istio setup. Note there are a couple of different ways of installing Istio, but this is the easiest way.
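With this many pods in the istio-system namespace, it can be easier to list only the ones that are not yet in the expected state from Step 4. A small filter, sketched below, does that. The function name is made up, and it assumes the default column layout of 'kubectl get pods' (NAME, READY, STATUS, ...).

```shell
# pods_not_ready: read `kubectl get pods --no-headers` output on stdin and
# print the name and status of every pod that is neither Running nor Completed.
# Assumes STATUS is the third column, as in the default kubectl output.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (not run here):
#   kubectl get pods -n istio-system --no-headers | pods_not_ready
```

An empty output means every pod has reached Running or Completed.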



Step 5: Run the below commands to create a namespace called 'my-namespace' and make it the default namespace in the current context. The current context name 'kubernetes-admin@kubernetes' in the third command has to be modified based on the output of the second command.

a) kubectl delete namespaces my-namespace;kubectl create namespace my-namespace
b) kubectl config current-context
c) kubectl config set-context kubernetes-admin@kubernetes --namespace=my-namespace


Step 6: The Istio sidecar can be injected into the application manually or automatically. We will look at the automatic way. Label the namespace with the 'kubectl label namespace my-namespace istio-injection=enabled' command. With this label, any application deployed in this namespace will have the Istio sidecar injected into it automatically.


Step 7: Deploy the Bookinfo application using the 'kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml' command. Make sure to be in the Istio folder as shown below.


Check the status of the pods (kubectl get pods). They should be in the Running status after a few minutes.


To confirm that the application is running, make a call to the Bookinfo webpage from one of the pods using the below command.

kubectl exec -it $(kubectl get pod -l app=ratings -o jsonpath='{.items[0].metadata.name}') -c ratings -- curl productpage:9080/productpage | grep -o "<title>.*</title>"

The output should be as shown below, which means that the application has been deployed successfully. Note that the READY column says 2/2 in the above screenshot of the pods. Why is that? One is the main application container and the other is the Envoy proxy container injected by Istio. Without the label on the namespace it would say 1/1, as the Envoy proxy container would not be injected.
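To see which two containers make up the 2/2, the container names of a pod can be listed directly. The sketch below assumes the injected proxy container is named 'istio-proxy', which is what I have seen with Istio's sidecar injection. The function names are made up for this example.

```shell
# pod_containers POD: print the container names of a pod, space-separated.
pod_containers() {
  kubectl get pod "$1" -o jsonpath='{.spec.containers[*].name}'
}

# has_sidecar POD: succeed when one of the pod's containers is named
# istio-proxy (assumed name of the injected Envoy proxy container).
has_sidecar() {
  case " $(pod_containers "$1") " in
    *" istio-proxy "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not run here), with a pod name taken from 'kubectl get pods':
#   has_sidecar productpage-v1-6b6798cb84-v6l7p && echo "sidecar injected"
```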


Step 8: An overlay network is created by default and the Bookinfo webpage is only accessible within this network and not from the outside. To reach it we have to use port forwarding with the 'kubectl port-forward --address 0.0.0.0 pod/productpage-v1-6b6798cb84-v6l7p 1234:9080' command. The pod name in this command has to be changed; it can be obtained by running the 'kubectl get pods' command.
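The pod name can also be looked up by label, the same way the ratings pod was found in the earlier curl command. This is a sketch that assumes the productpage pods carry the label app=productpage, as they do in the Bookinfo sample I used. The function names and the default local port are my own choices.

```shell
# productpage_pod: print the name of the first pod labelled app=productpage.
productpage_pod() {
  kubectl get pod -l app=productpage -o jsonpath='{.items[0].metadata.name}'
}

# forward_productpage [LOCAL_PORT]: forward a local port (default 1234)
# on all interfaces to port 9080 of the productpage pod.
forward_productpage() {
  kubectl port-forward --address 0.0.0.0 "pod/$(productpage_pod)" "${1:-1234}:9080"
}

# Usage (not run here):
#   forward_productpage         # then browse to <node-ip>:1234/productpage
#   forward_productpage 8080
```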


Instead of port forwarding we could have used the Istio Gateway as mentioned here. But this automatically creates a load balancer, which is possible only in the Cloud and not on the local machine. So, we are using port forwarding as mentioned above.

Step 9: Now the Bookinfo webpage can be accessed from the browser (192.168.0.101:1234/productpage). Note that the IP address has to be modified to match the IP address of any of the nodes in the K8S cluster.


Step 10: Use the below commands to clean up the Bookinfo application and Istio.

samples/bookinfo/platform/kube/cleanup.sh

kubectl delete -f install/kubernetes/istio-demo-auth.yaml
for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl delete -f $i; done




Conclusion


In this blog we looked at the steps required to install Istio and then the Bookinfo application on top of K8S. It's not too difficult to install Istio as described above, but the Cloud vendors providing managed K8S make it even easier with a single click installation of Istio or any other service mesh.

In future blogs, we will explore the different features of Istio in a bit more detail using the Bookinfo or some other application. This will make it clear what Istio and service mesh are all about.
 

March 20, 2019


Forrester Blogs

Happy Agents Deliver Great Customer Service — Here’s How To Do It

Happy customer service agents mean happy customers — and happy shareholders. Engaged agents also have better job performance, are more productive, and stay in their jobs longer — which is one of the...

...

Forrester Blogs

FORRward: A Quick Weekly Read For Tech And Marketing Execs

It’s Time To Start Talking About Professional Licensing For Software Engineers It wasn’t until enough boilers blew up that mechanical engineers had to get licensed. It wasn’t until bridges and...

...

Forrester Blogs

Accessible Pizza And Other Highlights From CSUN 2019

Domino’s Pizza suffered a major blow recently: A federal appeals court reversed a lower court ruling* and affirmed that the Americans with Disabilities Act (ADA) does cover the Domino’s websites and...

...
 

March 19, 2019


Forrester Blogs

Digital Myth No. 5 — Every Tech Vendor Is An Expert In Digital Transformation

While every tech vendor seems to lay claim to being an expert in digital transformation, it stands to reason that not all can be. For sure, there are many vendors with experience helping clients...

...

Forrester Blogs

What’s Going On With Online Grocery

Last week, IRI hosted its annual Growth Summit in Denver (where some hapless attendees were snowed in by an unfortunate blizzard on the second day). For anyone not familiar with IRI, it is one of the...

...

Forrester Blogs

Shapeshift Your Retail Offer With Software-Defined Stores

We’ve all learned about the advantages of software-defined infrastructure. We don’t think twice about network or desktop virtualization. But what about using software to define and remotely manage...

...
 

March 18, 2019


Forrester Blogs

Data Management Platform Coverage Takes On A New Look

When I started my career in marketing analytics almost 20 years ago, the biggest challenge was wrangling first- and third-party data, joining them together, and analyzing customer patterns. It was...

...
Cloud Avenue Hadoop Tips

Quickly and easily installing K8S on the local machine

In the previous blog here we saw how to get started with K8S easily with zero installation using Play-With-Kubernetes (PWK). Everything happens on remote machines, so there is nothing to install on the local machine. We can get started with K8S in less than 5-10 minutes using PWK. The main con of PWK is that a session is available for only 4 hours, after which any modifications to the K8S cluster are lost.

One easy way to use K8S locally is to use Minikube as mentioned here, but it provides a single node cluster and it makes it tough to test the different failure scenarios like a node going down and a few other things.

In this blog we will install a multi-node K8S cluster locally on the laptop as mentioned here, so that the changes are persisted across sessions. We should be able to continue from where we left off. K8S-the-hard-way sets up a cluster from scratch, but it takes time and expertise. So, there are tools like kubeadm which abstract the details and make the installation process easier.

With kubeadm there is a sequence of steps to install a multi-node cluster on the laptop, and for those who are new to Linux it might be a pain. So, I was trying to figure out if the installation process using kubeadm can be automated using Vagrant. I tried for a couple of hours, got stuck and gave up. And then luckily I found a ready-made Vagrantfile in this article, which made the K8S installation process a breeze.

On a side note, a multi-node K8S cluster can be run in the Cloud, but not everyone is comfortable with the Cloud, so here are the steps using VirtualBox and Vagrant on the local machine.

Step 1: Download and install the latest version of VirtualBox and then Vagrant. For Vagrant, you might have to restart the OS. The installation is as straightforward as installing any other Windows software.

Step 2: Make a folder on the laptop and create a Vagrantfile with the content from here. If required, the amount of Memory and the number of CPU cores can be modified in this file.

Step 3: Go to the above created folder and run the command 'vagrant up' from the Command Prompt. It takes a couple of minutes to create Virtual Machines in VirtualBox, download and install the K8S and the required binaries. The end screen will appear as shown below.


And the Virtual Machines (k8s-head, k8s-node-1 and k8s-node-2) will appear as shown below. We are all set with the K8S installation. It's a piece of cake. It has never been this easy to install software.


Step 4: K8S follows a master-slave architecture. Log in to the master using 'vagrant ssh k8s-head' and run the 'kubectl get nodes' command to make sure all the nodes are ready.
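Right after 'vagrant up' the nodes may still be joining, so 'kubectl get nodes' can show NotReady for a while. Rather than re-running it, kubectl can block until all nodes are ready. This is a sketch to be run on k8s-head. The wrapper name and the default timeout are my own choices.

```shell
# wait_for_nodes [TIMEOUT]: block until every node reports the Ready
# condition, or fail once the timeout (default 300s) has passed.
wait_for_nodes() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-300s}"
}

# Usage (not run here):
#   wait_for_nodes && kubectl get nodes
```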



Step 5: Now let's create a deployment using the 'kubectl run nginx --image=nginx -r=4' command and make sure it has been deployed using the 'kubectl get deployment' and 'kubectl get pods' commands.




Step 6: Now if we want to destroy the cluster, run the 'vagrant destroy -f' command from the earlier created folder and the Virtual Machines will be shut down and deleted.
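If the goal is only to free up the laptop between sessions, the Virtual Machines can be stopped instead of destroyed, which keeps the cluster state on disk. The wrapper below is just a convenience sketch around the standard vagrant commands. The name 'cluster' is made up, and it has to be run from the folder that contains the Vagrantfile.

```shell
# cluster start|stop|status: thin wrapper around vagrant for the K8S VMs.
# Must be run from the folder containing the Vagrantfile.
cluster() {
  case "$1" in
    start)  vagrant up ;;      # boot the VMs (creates them on first run)
    stop)   vagrant halt ;;    # shut the VMs down but keep their disks
    status) vagrant status ;;  # show the state of each VM
    *) echo "usage: cluster start|stop|status" >&2; return 2 ;;
  esac
}

# Usage (not run here):
#   cluster stop     # at the end of the day
#   cluster start    # continue from where we left off
```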


Step 7: During the installation if something goes wrong then it will be displayed on the screen and more details will be logged to 'ubuntu-xenial-16.04-cloudimg-console.log' file in the same folder.


As seen above, all it takes is a couple of steps to create a multi-node K8S cluster on the laptop. Now you should be all set to get started and explore the world of K8S. Further nodes can be added by modifying the Vagrantfile and running the 'vagrant up' command.

In the upcoming blogs, we will try to install additional packages or applications on top of the above K8S cluster and try different things with them.

Note: Joserra in the comments points to the K8S blog on the same topic here. This blog uses VirtualBox and Vagrant, while the K8S blog uses Ansible to run the commands in the VM on top of VirtualBox and Vagrant. The end result of both is the same.

Forrester Blogs

The French And UK Customer Experience Index Results Are In!

Each year, based on a survey of your customers, Forrester releases results of its Customer Experience Index (CX Index™). The methodology measures how well a brand’s customer experience...

...
 

March 17, 2019


Forrester Blogs

The Future Of Department Stores

Department stores can enjoy a bright future — but to do so, they must change to survive and thrive in the face of fierce competition from brands’ direct-to-consumer initiatives. Synchronizing supply...

...
 

March 15, 2019


Simplified Analytics

HOW DIGITAL IS CHANGING THE WORKPLACE? BY SANDEEP RAUT Via BTN

My Article published in The Business Transformation Network (BTN) a thought leadership network, which allows experts in their field to showcase knowledge through the dissemination of ideas...

...
 

March 14, 2019


Forrester Blogs

Choosing The Right NoSQL Product

NoSQL is more than a decade old! Only a few years ago, we were talking about how NoSQL was still maturing and its ecosystem still evolving. Today, NoSQL is used by more than half of large companies...

...

Forrester Blogs

Creative Media Outshines The Technology

The excess of adtech and martech such as demand-side platforms (DSPs) and data management platforms (DMPs) gives marketers all the control and insights required to run efficient and effective media...

...
 

March 13, 2019


Forrester Blogs

Qualtrics X4 2019: What Do Sir Richard Branson, Barack Obama, And Oprah Winfrey Have To Say About Creating Better Experiences?

I was at Qualtrics’ X4 Summit in Salt Lake City last week. The two-day event featured a heavy-hitting mainstage lineup, including President Barack Obama, Sir Richard Branson, Ashton Kutcher, Adam...

...

Forrester Blogs

The Forrester Wave™: Managed Security Services In Asia Pacific, Q1 2019 Identifies The 11 Most Important Vendors

CISOs in Asia Pacific must justify their spending and articulate the business value of often expensive investments in managed security to a largely non-security audience of executives. Currently,...

...

Forrester Blogs

Memo To Cheating Parents: You Are The Problem

Forrester does not write about college admissions. We don’t cover scandals, political, social, or celebrity. But for some reason I can’t keep my mouth shut on today’s hot topic:...

...
 

March 12, 2019


Revolution Analytics

R 3.5.3 now available

The R Core Team announced yesterday the release of R 3.5.3, and updated binaries for Windows and Linux are now available (with Mac sure to follow soon). This update fixes three minor bugs (to the...

...

Forrester Blogs

Interoperability And The Empowered Consumer Set The Stage At HIMSS19

The largest healthcare event, the Healthcare Information and Management Systems Society’s (HIMSS) conference, highlights the key themes that will be most influential for the industry in the coming...

...

Forrester Blogs

Choose The Right Enterprise Architecture Management Tool Using Forrester’s Latest EAMS Wave™

Today’s enterprise architecture (EA) practices are relevant because they enable a firm’s customer-led and digital transformations. As more traditional EA use cases become commoditized, vendors that...

...

Forrester Blogs

The Security Snapshot: Cybersecurity And Privacy In 2019 — Prepare For The New; Protect The Established

Introducing our new monthly blog series, “The Security Snapshot,” which will curate and highlight key pieces of research from the security and risk (S&R) team. Last week at RSA Conference,...

...

Forrester Blogs

The Insights Beat: Future-Proof Your Insights Capability

Last month, our featured research focused on building the foundation of an insights-driven business. This month, it’s time to start renovating, quickly. While every house needs a strong foundation,...

...
 

March 11, 2019


Forrester Blogs

OK, Zero Trust Is An RSA Buzzword — So What?

Last week was the annual RSA Conference. Estimates are that more than 45,000 security personnel, business professionals, and leaders attended the event, up from 35,000 last year. Regardless of the...

...

Forrester Blogs

Atoms Get Their Revenge At The Intelligent Edge

My colleagues J. P. Gownder, Craig Le Clair, and I just published the results of a year-long study to answer the question “What happens when digital business systems and physical-world processes come...

...
Ronald van Loon

Succeed in the Intelligent Era with an End-to-End Data Management Framework

The last decade has seen unprecedented advancements in artificial intelligence. We have moved towards a data-centric approach, and data is the center of everything digital. The data collected through different sources is refined, analyzed, and orchestrated with data platforms to generate intelligent insights that can facilitate the growth of any organization.

The spread of these data platforms, coupled with the advancements in artificial intelligence, enables what has come to be known as the intelligent era. Enterprises are now making smart decisions, backed by actionable insights that also give them the guidance they need for the future.

As an SAP partner, I was given the opportunity to explore the SAP Data Hub and get insights into the data management challenges organizations face in the new Intelligent Era.

What to Expect in the Near Future

With broader horizons and a solid base to build on, we can expect data to become an even more significant part of how organizations operate in this intelligent era. The following points list what can be expected from the future:

  • Enterprises will be able to generate data from any business, any person, and any device. This will be augmented by the Internet of Things or IoT and the broader 3rd party data economy.
  • We can expect better applications that are focused on improving the customer experience among many other things.
  • Enterprises will be able to have access to information essential to their products and services. The sales sphere will be smarter than it is now, with many personalized tools and techniques, marketing to every customer out there.

Humans have surely done the tough part, and they need to be consistent to reap the fruits of AI. However, data management is a necessity that shouldn’t be compromised. The foundation of all intelligence in the future and the benefits we mentioned above lie in the provision of data management. Data is the center of the IT landscape.

Data that is properly structured and duly managed, from its arrival to its processing within analytic tools or business processes, generates the best insights and actions. The insights you generate to fuel your enterprise's intelligence and to make smart decisions will only be as good as the data that you have in your hands. Poor quality data and a lack of data lead to poor quality insights and bad decision-making. Thus, data management is a crucial facet of an intelligent environment that must not be ignored. Since it plays a pivotal role in dictating and building the ecosystem of change, data itself requires stringent checking and management throughout the whole analytics and business process.

What Hinders Companies from Performing in the Intelligent Era

Having talked about the shiny side of the intelligent world, we should also look at some major challenges that require due attention. The SAP State of Big Data Study, published a year ago, revealed important statistics related to the challenges faced by organizations going smart.

According to the results, almost 74 percent of enterprises felt that their data landscape/ecosystem is too complex for them to understand, so much so that it is believed to limit their productivity and innovation. 86 percent believed they weren't getting much value out of their data, and 84 percent of all CEOs were extremely concerned about the quality of the data and how the insights generated from it could negatively impact decision-making in the future. This skepticism is further fueled by the fact that the average organization loses over $9.7 million every year due to poor quality data and the bad decisions based on it.

Enterprises and partners are realizing the benefits and the promise of an intelligent ecosystem by focusing on data transparency, by leveraging existing investments, by building new data processes, and by transforming data landscapes. They do plan to incorporate machine learning and open source technologies for insights, but the challenges cannot be ignored. These challenges include:

Siloed Data

Siloed data is perhaps the biggest challenge that enterprises looking to go smart are facing. The presence of large sets of secluded data on public or private clouds such as Hadoop, Microsoft, Google, or Amazon Web Services makes it challenging for organizations to gather the data required for decision making. Data is no longer only the domain of Enterprise systems; it is being created by devices and 3rd parties and derived from a myriad of sources. The processing of data therefore also needs to reflect the new data landscape, connecting, refining, and utilizing data wherever it resides.

Most organizations want to run real-time analytics that give them results and insights from their data on the go. But with these siloed platforms the data is scattered and inconsistent, and there is little potential for generating enterprise-wide insights in real-time. This concerns organizations and entrepreneurs as they search for options to solve this lack of company-wide insights.

Data Volume

Due to the Internet of Things with literally billions of connected sensors and devices, mobile data, social data like Facebook, Snapchat and others, new data types arise almost daily.

The amount of data being generated, and the need to digest this volume, is unprecedented. Data volume, data generation velocity and data complexity are increasing to a level that has never been seen before.

It is not only a challenge of completely new data types in structured and unstructured data sources that get generated, but it is also a challenge of volume.  

Data Migration and Integration

Data migration, or the integration of all data to a single data store is difficult to perform. The substantial increase in IoT data sources, volume and velocity makes it hard to migrate data to a single source or integrate each source with all analytics and applications. It is often the case that analysis and decisions have to be made on data at the point of capture, at the edge, or on data in transit. Data migration and integrations also require significant expertise in the field of data management, which is scarce.

The combination of these factors makes it complex and difficult to build an intelligent enterprise that uses data to the fullest potential. It is because of the complications involved here that 84 percent of all CEOs believe that their data landscape is particularly complex and requires simplification.

Data Quality

It is simply impossible to handle this amount and complexity of data without software support, ranging from data quality management all the way up to machine learning and artificial intelligence. To understand the true meaning of data, help is required.

We have seen that companies are challenged by diverse and separated data landscapes and by growing, increasingly complex data types, and that they have a hard time ensuring data quality and understanding the full meaning of their data. On top of that, it is very challenging to ensure consistent data quality and transformation across many different data silos, because data cleansing and enrichment may be performed differently in each implementation without central governance and alignment.

The Need for an End-to-End Data Management Framework

Enterprises drowning in the implications of data need an end-to-end management platform. Managing data on several public and private clouds can be too much to handle. A good end-to-end data management system should:

  • Unify Data: With an end-to-end data management framework, an organization should be able to orchestrate and unify all of its valuable distributed data from public and private clouds. This applies both to the data itself and to the metadata that describes it consistently, so that the data is easy to find, understand, and utilize.
  • Simplify Landscape: Perhaps the most important functionality of an end-to-end data management system should be that it simplifies the landscape and makes it easier for enterprises to view and understand their data and work on it in one central place. Business users and IT users need to be able to discover the data and metadata and understand the lineage or how that data was sourced and transformed. Hindrances should be reduced to a minimum.
  • Handle Complexity: With the landscape already simplified, an end-to-end data management system should also handle complexity by combining data from disparate sources and making near real-time analyses easier. This enables faster development of new analytics and decision processes through re-use, reduced redundancy, and making data available as a service to users and processes. This simplification should also support the agility to adjust to new data, new analytics, and new technologies in order to deliver the next generation of data-driven innovations to the business. The ability to re-use existing data integration, data processing, and code is a must: no customer wants to re-implement productive processes and data flows; they would rather focus on driving new value from net new innovations.
  • Centralize Governance: Data governance is a major concern for most managers, especially in the era of GDPR, which is why an end-to-end data management setup should centralize governance and bring data access control over to the people that matter. Insight into the metadata, the profiles of data in the connected data stores, the data processes executed, and the end-to-end lineage across the interconnected systems is critical. It drives an understanding of what data is available, how it is defined, used, and consumed, and whether business and regulatory policies are being complied with.
  • Data Management: Finally, and perhaps most importantly, an end-to-end solution should help you manage your data better. It should provide a single solution that gives you full visibility into your data without the data leaving its original home. Data should flow through a pipeline where it is collected, orchestrated and cleansed, so you do not need to worry about poor-quality insights and decision-making. This is not limited to the data integration flow; it also covers the process operations that are executed on data as it flows across the landscape. A data management framework should manage your data for all future needs and even trigger actions based on predefined parameters.

End-to-End Data Management Systems at Work

The SAP Data Hub provides an end-to-end data management system to augment the efforts of enterprises towards a smarter future. Furthermore, embedded machine learning algorithms and integration with IoT give SAP Data Hub the functionality its customers need to become truly intelligent enterprises. Kaeser Compressors SE, a manufacturer of compressed air solutions, is a user of the SAP Data Hub. The manufacturer can now integrate IoT data with customer data and equipment information to generate close to real-time insights. This allows the company to adjust its business processes on the fly, resulting in a better productivity rate, more agile ticket handling and customer service, and ultimately cost savings.

The latest SAP Data Hub release provides customers with a lean deployment and an easy installation option, offering a completely containerized setup for decentralized data processing. The enhanced Metadata Management and Cataloging functionality introduced with this release allows customers to quickly and easily get an overview of their data sources, understand the lineage of data and operations, and define and manage these in one central place, which is critical for compliance topics like GDPR. Additionally, SAP Data Hub is previewing new capabilities to integrate ABAP-based systems into pipelines, to integrate cloud data integration tools, and to integrate SAP Data Hub into BPM-designed business processes across the wider data landscape.

SAP’s HANA business data platform carries all the benefits of an end-to-end data management platform mentioned above and, together with SAP Data Hub, forms the foundation for the Intelligent Enterprise. The HANA data platform, together with the functionality enabled through SAP Data Hub, helped Kaeser Compressors increase productivity and improve customer service by taking charge of the complexities involved in the data management process and delivering the insights generated from their data. The biggest benefit of SAP Data Hub that customers have reported is its ability to bring their structured and unstructured data together without the high costs of moving or replicating the data.

SAP’s Data Hub creates a foundation for all intelligent enterprises and augments the efforts of thought leaders in the creation of a smarter world.

Ronald

Ronald helps data-driven companies generate business value with best-of-breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining. He is a guest author on leading Big Data sites, a speaker, chairman and panel member at national and international webinars and events, and runs a successful series of webinars on Big Data and Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies and is now director at a data consultancy company, a leader in Big Data and data process management solutions. He has a broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.


The post Succeed in the Intelligent Era with an End-to-End Data Management Framework appeared first on Ronald van Loons.


Forrester Blogs

“Alexa, Did Any Of My Clients Add Cash Last Week?”

Mobile technologies, artificial intelligence, and predictive analytics at firms such as Amazon, Netflix, and Pandora anticipate customer needs and improve customers’ daily lives. These firms...

...

Forrester Blogs

FORRward: A Weekly Read For Tech And Marketing Execs

There Is A Tech Talent Gap . . . No Wait, There Isn’t  Tech talent continues to be one of the most common debates in the industry today. Conventional wisdom says that there is a critical...

...
Cloud Avenue Hadoop Tips

Getting started with K8S the easy way using 'Play with Kubernetes'

There are many ways of installing K8S as mentioned here. It can be installed in the Cloud, on-premise and also locally on the laptop using virtualization. But installing K8S has never been easy. In this blog, we will look at one of the easiest ways to get started with K8S, using Play with Kubernetes (PWK). With this, the whole K8S experience is within the browser and there is nothing to install on the laptop; everything is installed on the remote machine. PWK uses 'Docker in Docker', which is detailed here (1, 2).

Step 1: Go to https://labs.play-with-k8s.com/, log in and click on Start. A Docker or a GitHub login is required for this.


Step 2: PWK allows up to 5 nodes or machines. Click on 'ADD NEW INSTANCE' five times to add 5 instances, named node1 to node5. Here we will configure node1 as the master and the remaining nodes as workers.

Clicking on a node in the left pane will give access to the corresponding terminal in the bottom right pane. The combination 'Alt+Enter' will maximize the terminal.



Step 3: Run the 'kubeadm config images pull' command on node1. This pulls all the images required for the installation before the actual installation starts in the next step. The step is optional, but it makes the installation faster.


Step 4: Initialize the master on node1 using the 'kubeadm init --apiserver-advertise-address $(hostname -i)' command. Note down the 'kubeadm join .....' command printed towards the end of the output; it is needed in Step 6.
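
If the output of 'kubeadm init' was saved to a file, the join command can be pulled out of it instead of being copied by hand. The below is only a rough sketch: the function name extract_join_command is made up for this example, and it assumes the join command starts on a line beginning with 'kubeadm join' and may continue on following lines when a line ends with a backslash.

```shell
# Hypothetical helper (not part of kubeadm): read saved 'kubeadm init' output
# on stdin and print the first 'kubeadm join ...' command as a single line.
extract_join_command() {
    awk '
        # Start capturing at the first line that begins with "kubeadm join".
        !found && /^[ \t]*kubeadm join / { found = 1; capture = 1 }
        capture {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop leading indentation
            more = sub(/[ \t]*\\$/, "", line)   # strip a trailing backslash, if any
            printf "%s%s", sep, line
            sep = " "
            if (!more) { printf "\n"; capture = 0 }   # command is complete
        }
    '
}

# Example usage on node1 (init.log is an assumed file name):
#   kubeadm init --apiserver-advertise-address $(hostname -i) | tee init.log
#   extract_join_command < init.log
```

If the output was not saved, running 'kubeadm token create --print-join-command' on the master should print a fresh join command.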



Step 5: Now is the time to deploy the Pod network using the below command on node1.

kubectl apply -n kube-system -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 |tr -d '\n')"


Step 6: Execute the 'kubeadm join ......' command noted down in Step 4 on all the workers (node2, node3, node4 and node5). On each node, the message 'This node has joined the cluster' is displayed towards the end of the output.



Step 7: After a few minutes run 'kubectl get nodes' on the master node (node1); all the nodes should be in the Ready status. This confirms that our 5-node K8S cluster is ready.
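
Instead of eyeballing the STATUS column, the check can be scripted. The below is a small sketch, not an official tool: count_not_ready is a made-up name, and it assumes the usual column layout of 'kubectl get nodes --no-headers' with the status in the second column. A node shown as 'Ready,SchedulingDisabled' is counted as not ready here.

```shell
# Hypothetical helper: read 'kubectl get nodes --no-headers' output on stdin
# and print how many nodes have a STATUS other than exactly "Ready".
count_not_ready() {
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Example usage on node1:
#   kubectl get nodes --no-headers | count_not_ready
# A result of 0 means every node reported Ready.
```

kubectl also has a built-in 'kubectl wait --for=condition=Ready nodes --all --timeout=300s', which blocks until the nodes are ready or the timeout is reached.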


Step 8: Let's create a K8S Deployment with 4 replicas of the nginx server by running 'kubectl run nginx --image=nginx -r=4' on the master node (node1). Initially the status of the containers will be 'ContainerCreating', but in a few seconds it will change to 'Running'.
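
The same Deployment can also be described as a YAML manifest, which helps because newer kubectl releases no longer create a Deployment from 'kubectl run'. The below is a minimal sketch under a few assumptions: make_deployment_yaml is a made-up function name, and the 'app' label is just a choice for this example (the Deployment created by 'kubectl run' is labelled differently).

```shell
# Hypothetical helper: print a Deployment manifest for the given name, image
# and replica count. Nothing is sent to the cluster by this function.
make_deployment_yaml() {
    name=$1
    image=$2
    replicas=$3
    cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${name}
spec:
  replicas: ${replicas}
  selector:
    matchLabels:
      app: ${name}
  template:
    metadata:
      labels:
        app: ${name}
    spec:
      containers:
      - name: ${name}
        image: ${image}
EOF
}

# Example usage on node1:
#   make_deployment_yaml nginx nginx 4 | kubectl apply -f -
```

On kubectl 1.19 and later, 'kubectl create deployment nginx --image=nginx --replicas=4' should give a similar result without any YAML.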



Step 9: Get the detailed status of the Pods using the 'kubectl get pods -o wide' command. This will show that the Pods are balanced across the nodes.

The K8S Deployment object maintains a fixed number of Pods. Delete one of the Pods using 'kubectl delete pod NAME-OF-THE-POD'. Notice that the Pod is deleted and a new Pod is automatically created to replace it. This can be observed by running the 'kubectl get pods -o wide' command again. The replacement Pod has a different name from the deleted one.
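
To see exactly which Pod was created as the replacement, the Pod names can be saved before and after the delete and then compared. This is only a sketch: new_pods is a made-up function name, and before.txt and after.txt are assumed file names holding one Pod name per line.

```shell
# Hypothetical helper: given two files with one Pod name per line (taken
# before and after the delete), print the names found only in the second file.
new_pods() {
    awk 'FILENAME == ARGV[1] { seen[$1] = 1; next }
         !($1 in seen)       { print $1 }' "$1" "$2"
}

# Example usage on node1:
#   kubectl get pods -o name > before.txt
#   kubectl delete pod NAME-OF-THE-POD
#   kubectl get pods -o name > after.txt
#   new_pods before.txt after.txt
```

The second listing should be taken once the replacement Pod has appeared, otherwise nothing is printed.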


The K8S session is available for 4 hours. Any resources created or settings changed are lost after the session, as changes to the cluster are not persisted. So there are a few disadvantages to using PWK, but the good thing is that it is free and requires no installation on the local machine.

In the upcoming blogs, we will explore the other ways of installing K8S. Also, check out Katacoda, which offers K8S in the browser similar to PWK.
 

March 08, 2019


Forrester Blogs

Without Strong Governance, Facebook’s New Platform Vision Will Actually Undermine Privacy

With this week’s announcement, it’s official: Facebook will automatically correlate and merge the personal data of users across its platforms. The announcement is not a surprise, but it will find...

...
 

March 07, 2019


Revolution Analytics

Where to find the worst weather in the US

Which US city has the worst weather? To answer that question, data analyst Taras Kaduk counted the number of pleasant days in each city and ranked them accordingly. For this analysis, a "pleasant"...

...

Forrester Blogs

Data And Analytics Leaders, We Need You!

How do you create an insights-driven organization? One way is leadership. And we’d like to hear about yours. Today, half of the respondents in Forrester’s Business Technographics® survey data...

...

Forrester Blogs

Goliath’s Challenge: DTC Startups Get Personal

In this battle between direct-to-consumer (DTC) startups — cast as the proverbial David — and their large, established brand opponents, victory is not about size or stature; it’s about skillful...

...

Forrester Blogs

Uncovering 2020 Digital Experience Priorities: It’s That (Survey) Time Of Year!

This week, we launched our annual panel survey on digital experience (DX) priorities, challenges, and strategies. If you’re a leader, decision maker, or influencer of your enterprise’s DX...

...
 

March 06, 2019


Forrester Blogs

Think Privacy’s Just A Cost Center? Think Again

Over and over, clients tell us they just don’t get enough funding for the kind of privacy programs that they want to create. In fact, many privacy budgets shrank in 2019 after firms were forced to...

...

Forrester Blogs

The Forrester Wave™: Information Archiving Cloud Providers, Q1 2019

(Caleb Ewald contributed to this blog.) The archiving market may be mature, but it is far from stagnant. In recent years, there has been active consolidation among cloud archiving providers. Smaller...

...

Forrester Blogs

Attendee Journey Highs And Lows At MWC19 Barcelona

I just returned from MWC19 Barcelona, for which I was a judge of the GLOMO Awards. As a Forrester analyst whose research includes service design, I paid extra attention to the attendee journey. I’d...

...
 

March 05, 2019


Forrester Blogs

What CIOs Can Learn From Fleetwood Mac About Marketing

Here at Forrester, we love all things music. Each office floor has a different musical theme, and all the rooms are named after popular musicians of the time. It’s not uncommon to have a team meeting...

...

Forrester Blogs

Webinar: Digital Business Transformation Accelerators

Almost every technology consulting firm, and many others besides, claims to be able to help your business transform into a digital business. Digital transformation has been used to describe...

...

Forrester Blogs

Countdown To Forrester’s CX Sydney 2019 Forum

Change The Game By Embracing Radical CX Innovation OK, so you’ve elevated and transformed your customer experience (CX) practices over the past five (or more) years. These practices have become more...

...
 

March 04, 2019


Forrester Blogs

Tech Titans Alphabet And Microsoft Are Transforming Cybersecurity

Last April, we outlined how the “Tech Titans” (Amazon, Google, and Microsoft) were poised to change the cybersecurity landscape by introducing a new model for enterprises to consume cybersecurity...

...

Forrester Blogs

FORRward: A Weekly Read For Tech And Marketing Execs

Innovation, Schminnovation The Wall Street Journal recently published an op-ed titled “Beers, Bras and Brats.” This is a story about the perils of businesses that rest on their laurels. Brands such...

...

Forrester Blogs

Electronic Bill Presentment And Payment (EBPP) Is Undergoing A Sea Change — Catch The Wave!

Fun fact: I’ve been researching billing for Forrester since 2012. But 2018 was a year unlike any other in my billing research. For the first time, the majority of my client inquiries on the...

...

Forrester Blogs

What To Expect At RSA Conference 2019: Cloud As Security Improvement And The Possible End Of The Infosec Gilded Age

I recently did a webinar with a few of my colleagues from the RSA Conference Advisory Board on precisely this topic, which you can find here. We tried to expose as much as we could of the fantastic...

...

Forrester Blogs

Mobile World Congress Offers Food For Thought

Food Tech figured prominently on the menu at Mobile World Congress 2019. Mobile operators, network equipment providers, device manufacturers and solution providers address challenges from farm,...

...
 

March 01, 2019


Forrester Blogs

AI Creates The CRISPR For Corporate DNA

Every company has a specific DNA. You may think DNA is the brand or values. But corporate DNA is what makes a company tick, not what the company is. Talent, operations, technology, and data make up...

...

Revolution Analytics

An architecture for real-time scoring with R

Let's say you've developed a predictive model in R, and you want to embed predictions (scores) from that model into another application (like a mobile or Web app, or some automated service). If you...

...

Forrester Blogs

What I Learned At The Global E-Commerce Leaders Forum In LA

I try to attend at least one Global E-Commerce Leaders Forum (GELF) event every year. It’s a great event for brand and retail leaders in charge of global strategy to catch up on the latest and...

...

Forrester Blogs

The Three Fs From MWC Barcelona

MWC Barcelona is a wrap. The biggest theme of the event was the breathless anticipation of the benefits of the 5G networks, which are in early deployment this year. But practically speaking, 5G won’t...

...