Thursday, December 18, 2014

7 Predictions for 2015



By Scott Hedrick

December 18th, 2014

1. Big data crosses the chasm 


Hadoop and NoSQL production deployments will grow significantly among large organizations as they gain experience and assemble solutions from commercial vendors. Companies that leverage increasing amounts and types of data will start to reap a significant business advantage over the laggards.


2. Mainstream enterprises accelerate their move to SaaS and hosted environments

As large organizations become more comfortable with the security of the cloud, SaaS will continue to replace legacy on-premises apps at an increasing rate. Rapidly maturing offerings and tools will bring data storage, processing and analytics in hosted environments into mainstream adoption among large organizations as companies look to increase efficiency and agility.


3. IoT + big data

IoT reaches the early majority as companies collect all the data from millions of connected devices in #Hadoop data lakes.  The value will increase as this IoT data starts to be combined with existing enterprise and customer data to take the value of analytical insights to the next level.


4. AWS gets greater competition


AWS usage will continue to grow rapidly as large enterprises' movement to cloud and SaaS accelerates, but its dominance will be challenged and it will increasingly have to respond to competition. Microsoft Azure is picking up steam, and I expect Microsoft to give AWS more serious competition in 2015. Google continues to invest, but will struggle to gain market share against AWS without some aggressive moves.


5. More big data companies go IPO


On the heels of successful IPOs by Hortonworks #HDP and New Relic #NEWR, 2015 will see them followed in what could become a stampede of big data companies rushing to go public. This will fuel the industry to invest in the next innovations and startups.


6. Mobile market increasingly resembles PCs


Market shares and profit splits between Android and iOS globally are looking increasingly like those of Windows and Mac OS for PCs. While Android has a nearly dominant market share, Apple has enough, especially among the high-end users who spend more, that it can hold its own and will continue to attract outsized investment and innovation. As with PCs, a majority of smartphone profits will go to Apple, with most other players struggling to profit from increasingly hard-to-differentiate products.


7. Privacy concerns rise for IoT


Consumer privacy concerns start to rise significantly for connected consumer devices, especially smart TVs and cars. Governments outside Europe will continue to lag in regulation, and consumers will become increasingly concerned about how data on their private behavior is being collected and analyzed. This could come to a head as private data from homes and cars is revealed to be used not only in court to prosecute, but also by governments and black hat hackers in the real world.

Wednesday, December 17, 2014

Accelerate Time to Production with Informatica’s DataStax Enterprise Connector for Cassandra Data

This blog post was originally posted on the DataStax website on December 17th, 2014 here:

http://www.datastax.com/2014/12/accelerate-time-to-production-with-informaticas-datastax-enterprise-connector-for-cassandra-data


BY SCOTT HEDRICK - DECEMBER 17, 2014
Scott leads the Big Data partner ecosystem at Informatica. He works with key partners to bring great integrated joint solutions to market and promote them to the world. Prior to Informatica, Scott drove success through marketing, ecosystems, partnerships and product leadership at companies including Nokia, Opera Software, MontaVista Software, Sun Microsystems and mobile startups. Scott helped pioneer the market for Linux and web technologies for smartphones and internet devices, such as Nintendo Wii, Sony Internet TVs and Motorola phones.
To enable large organizations to more easily integrate DataStax Enterprise (DSE) into their production data pipelines, we have been working with our friends at DataStax to add connectivity to data stored in Cassandra. DSE, built on Apache Cassandra™, delivers a production-ready, enterprise-ready version of Cassandra, enabling Internet Enterprises to compete in today’s high-speed, always-on data economy. The connector enables companies to use Informatica Big Data Edition, PowerCenter or Vibe Data Stream for high-speed data loading and extraction with DSE using Informatica’s proven visual design-driven products for data integration and quality.
Informatica’s Big Data Edition and PowerCenter connector to DSE is certified on DSE version 4.5 or higher and supports CQL (Cassandra Query Language) 3.0, collections, failover to secondary hosts and Unicode. Informatica’s visual data mapping can be used to bring in data from the hundreds of sources Informatica supports, then profile, transform, parse and cleanse the data before loading the curated data sets into DSE for operational systems, all without writing a line of code. This connector is based on the ODBC 3.52 standard for high-speed connectivity to DSE on-premises as well as hosted in cloud environments.
Vibe Data Stream support for DSE empowers high-speed streaming of operational data for real-time use cases. Vibe Data Stream is based on Ultra Messaging, the fastest data streaming technology available today, and is configurable through a visual development environment. Vibe Data Stream can power Lambda architecture solutions by loading streaming data into DSE for real-time operations and, in parallel, into other data stores for long-term storage and analytics.
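To make the parallel-loading idea concrete, here is a minimal Python sketch of the fan-out pattern: the same stream of events is delivered to one store for real-time operations and to another for long-term storage. It is only an illustration and does not use Vibe Data Stream or DSE APIs; the InMemorySink class, the fan_out function and the sample events are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemorySink:
    """Hypothetical stand-in for a data store; it only records events."""

    def __init__(self, name):
        self.name = name
        self.events = []

    def write(self, event):
        self.events.append(event)


def fan_out(events, sinks):
    """Deliver every event to every sink, using one worker thread per sink."""

    def drain(sink):
        for event in events:
            sink.write(event)
        return len(sink.events)

    with ThreadPoolExecutor(max_workers=len(sinks)) as pool:
        # pool.map returns results in the same order as the sinks list.
        return list(pool.map(drain, sinks))


# Sample events are made up for illustration.
events = [
    {"device": "sensor-1", "temp": 21.5},
    {"device": "sensor-2", "temp": 19.0},
]
speed_layer = InMemorySink("operational store")  # real-time operations
batch_layer = InMemorySink("long-term store")    # history and analytics
counts = fan_out(events, [speed_layer, batch_layer])
```

Each sink is written by its own thread, so a slow long-term store would not hold up the operational one. A production pipeline would also need retries and delivery guarantees, which this sketch leaves out.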
Companies using DSE for critical production systems can use Informatica’s data quality and parsing capabilities to ensure the veracity and formatting of their data. Informatica Big Data Edition executes data integration and quality on Hadoop and enables efficient data preparation for Cassandra-based systems at massive scale. Informatica’s visual parsing and prebuilt parsing libraries enable tool-based transformation of unstructured or specially structured data into formats that work well with DataStax operational database management systems.
Informatica plus DSE can be used to power real-time operational analytics, enterprise search, fraud detection, Internet of Things, transactional systems and other use cases. In conjunction with a data warehouse, Informatica can pull together, prepare and stage data sets of any size or type for operational systems as well as historical analysis. The combination of Informatica and DSE makes Cassandra data more accessible and accelerates time to production.
Informatica provides a fantastic solution for enterprises and large organizations that want to use proven data integration capabilities with DSE in a manageable way. Using Informatica, risk to new projects is reduced and agility is enhanced through visual mappings that can be rapidly evolved and added to over time. For companies already using Informatica, this new connector makes it especially easy to add DSE to existing data infrastructures.  Informatica makes DSE a more integrated part of enterprise data pipelines to drive value from big data.
Take a look at the short demo of Informatica’s Cassandra connector here:  http://youtu.be/IJSxGEn5hlk

Building an Enterprise Data Hub: Choosing the Data Integration Solution

This blog post was originally posted on the Informatica website on December 16th, 2014 here:

http://blogs.informatica.com/perspectives/2014/12/16/part-3-building-an-enterprise-data-hub-and-choosing-the-data-integration-solution/#fbid=GN4ScbY72Yd


Building an Enterprise Data Hub with proper Data Integration

Data flows into the enterprise from many sources, in many formats, sizes, and levels of complexity. As enterprise architectures have evolved over the years, traditional data warehouses have become less a final staging center for data and more one component of the enterprise that interfaces with significant data flows. Because data warehouses should focus on being powerful engines for high-value analytics, they should not be the central hub for data movement and data preparation (e.g. ETL/ELT), especially for the newer data types in use today, such as social media, clickstream data, sensor data and internet-of-things data.
When you start seeing data warehouse capacity consumed too quickly and performance degradation where end users are complaining about slower response times, and you risk not meeting your service-level agreements, then it might be time to consider an enterprise data hub (EDH). With an EDH, especially one built on Apache™ Hadoop®, you can plan a strategy around data warehouse optimization to get better use out of your entire enterprise architecture.
Of course, whenever you add another new technology to your data center, you care about interoperability. And since many systems in today’s architectures interoperate via data flows, it’s clear that sophisticated data integration technologies will be an important part of your EDH strategy. Today’s big data presents new challenges related to a wide variety of data types and formats, and the right technologies are needed to glue all the pieces together, whether those pieces are data warehouses, relational databases, Hadoop, or NoSQL databases.
Choosing a Data Integration Solution
Data integration software, at a high level, has one broad responsibility: to help you process and prepare your data with the right technology. This means it has to get your data to the right place in the right format in a timely manner. So it actually includes many tasks, but the end result is that timely, trusted data can be used for decision-making and risk management throughout the enterprise. You end up with a complete, ready-for-analysis picture of your business, as opposed to segmented snapshots based on a limited data set.
When evaluating a data integration solution for the enterprise, look for:
  • Ease of use to boost developer productivity
  • A proven track record in the industry
  • Widely available technology expertise
  • Experience with production deployments with newer technologies like Hadoop
  • Ability to reuse data pipelines across different technologies (e.g. data warehouse, RDBMS, Hadoop, and other NoSQL databases)
Trustworthy data
Data integration is only part of the story. When you’re depending on data to drive business decisions and risk management, you clearly want to ensure the data is reliable. Data governance, data lineage, data quality, and data auditing remain important topics in an EDH. Oftentimes, data privacy regulatory demands must be met, and the enterprise’s own intellectual property must be protected from accidental exposure.
To help ensure that data is sound and secure, look for a solution that provides:
  • Centralized management and control
  • Data certification prior to publication, transparent data and integration processes, and the ability to track data lineage
  • Granular security, access controls, and data masking to protect data both in transit and at the source to prevent unauthorized access to specific data sets
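As a rough illustration of what field-level data masking means in practice, the Python sketch below hides all but the last few characters of selected fields before a record is shared. It is not taken from any Informatica product; the mask_value and mask_record functions, the field names and the sample record are all hypothetical.

```python
def mask_value(value, visible=4, mask_char="*"):
    """Replace all but the last `visible` characters with mask_char."""
    text = str(value)
    if len(text) <= visible:
        # Too short to reveal anything safely, so mask the whole value.
        return mask_char * len(text)
    hidden = len(text) - visible
    return mask_char * hidden + text[hidden:]


def mask_record(record, sensitive_fields):
    """Return a copy of the record with the sensitive fields masked."""
    return {
        key: mask_value(val) if key in sensitive_fields else val
        for key, val in record.items()
    }


# Sample record is made up for illustration.
customer = {"name": "Ada", "card_number": "4111111111111111", "city": "Oslo"}
masked = mask_record(customer, {"card_number"})
```

The original record is left untouched, so the unmasked copy can stay at the source while only the masked copy moves downstream.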
Informatica is the data integration solution selected by many enterprises. Informatica’s family of enterprise data integration, data quality, and other data management products can manage data of any format, complexity level, or size, from any business system, and then deliver that data across the enterprise at the desired speed.
Watch the latest Gartner video to see Todd Goldman, Vice President and General Manager for Enterprise Data Integration at Informatica, as well as executives from Cisco and MapR, give their perspective on how businesses today can gain even more value from big data.