Friday, February 17, 2012

JobServer Support on Mac OS X

Grand Logic is happy to announce the release of JobServer 3.4.4. For all those Apple fans, this release provides support for JobServer on Mac OS X. You can now install and deploy JobServer on your favorite Mac. This release includes minor bug fixes.

Download and test drive JobServer 3.4.4 now and learn more about soafaces, JobServer's powerful developer SDK. soafaces makes it easier to extend and customize JobServer and to develop custom jobs and backend automated services, while using some of the best Java/AJAX and web/SOA open source technology available to developers.

About Grand Logic
Grand Logic is dedicated to delivering software solutions to its customers that help them automate their business and manage their processes. Grand Logic delivers automation software and specializes in mobile and web products and solutions that streamline business.

Tuesday, February 14, 2012

Enterprise Job Scheduling for Big Data & Hadoop

Businesses of all sizes are looking beyond traditional business intelligence and taking a broader approach to BI, one that goes beyond the data warehouse and operational database technologies of the past. With the explosion of social communication, mobile device data and many other forms of unstructured data coming into focus, businesses are now more interested than ever in asking questions about their data and their customers that they could not ask before.

Hadoop-type solutions let businesses build out this new BI 2.0 architecture and begin to leverage their data and operations in new ways, asking questions that they could not have imagined possible in the past. Hadoop analytics lets businesses ask questions and build reporting solutions that effectively leverage massive (yet commodity) processing power to manipulate terabytes of data, something that was not practical for the average enterprise before.

Hadoop provides a broad stack of solutions, including CPU/compute clustering, parallel programming, distributed data management, advanced ETL and NoSQL-type data management. Hadoop is also moving quickly to build more advanced resource management to allow more efficient job flow processing on larger clusters, for the bigger deployments that may have hundreds or thousands of nodes and need to run many jobs concurrently.

Hadoop comes with a few internal capacity-type schedulers for managing cluster load and resources, but these are strictly for capacity scheduling between nodes inside the cluster; they are not functional or calendar-based job scheduling tools. Vanilla Hadoop distributions often lack the features enterprises need to manage and automate the full ecosystem and life cycle of data processing behind an end-to-end BI solution. In most cases an enterprise's IT group must build the necessary infrastructure to integrate Hadoop smoothly into its IT environment and avoid a lot of manual labor and impedance mismatches between its Hadoop operations and its traditional enterprise operations.

This is where JobServer, an enterprise job scheduler, comes into play. JobServer integrates with Hadoop at an enterprise IT level, letting analysts and IT administrators schedule and integrate their IT operations into the Hadoop stack. JobServer leverages a very open and flexible Java plugin API to let Java developers integrate their customizations tightly into JobServer and into Hadoop. Often what is needed is high-level job and workflow automation: scheduling ETL processing that pumps data from operational data stores into your Hadoop stack, and scheduling jobs to run at regular intervals based on business rules and business needs.
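To make the difference between capacity scheduling and calendar-based scheduling concrete, here is a minimal Python sketch of a business rule such as "run the ETL load at 02:00 on business days". It is not JobServer's API; the function name `next_runs` and its parameters are hypothetical and exist only for illustration.

```python
from datetime import datetime, time, timedelta

# Hypothetical helper, not part of JobServer or Hadoop.
def next_runs(after, run_at, count, weekdays=(0, 1, 2, 3, 4)):
    """Return the next `count` run times strictly after `after`.

    `run_at` is the time of day the job should fire and `weekdays`
    lists the allowed days (0 = Monday ... 6 = Sunday).
    """
    runs = []
    day = after.date()
    while len(runs) < count:
        candidate = datetime.combine(day, run_at)
        # Skip times that have already passed and days outside the calendar rule.
        if candidate > after and candidate.weekday() in weekdays:
            runs.append(candidate)
        day += timedelta(days=1)
    return runs

# Example: asked on a Friday morning, the next three business-day runs at 02:00.
upcoming = next_runs(datetime(2012, 2, 17, 10, 0), time(2, 0), 3)
for run in upcoming:
    print(run.strftime("%A %Y-%m-%d %H:%M"))
```

A real scheduler layers much more on top of this (time zones, holidays, retries), but the core idea is the same: the trigger comes from the calendar and the business rule, not from free capacity in the cluster.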

JobServer provides the job automation and job scheduling needed to accomplish this, plus key features such as audit trails that track which jobs were run, when, and who edited them. JobServer can, for example, be used to coordinate and orchestrate a number of Hadoop job flows into a larger job flow and then take the output and pump it back out into your enterprise reporting systems and enterprise data warehouses. JobServer also provides a number of GUI reporting features that let enterprise users, from programmers to IT staff, track what is going on in their Hadoop and IT environment and be alerted quickly to problems.
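As a rough illustration of what orchestrating several job flows into one larger flow involves, the Python sketch below orders a set of jobs so that each runs only after the jobs it depends on. The job names and the `execution_order` helper are made up for this example and are not JobServer features; the ordering uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical helper: maps each job name to the set of jobs it depends on.
def execution_order(dependencies):
    """Return the job names in an order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the flow contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())

# Illustrative flow: ETL out of an operational store, two Hadoop jobs,
# then an export back to the enterprise data warehouse.
flow = {
    "extract_orders": set(),
    "load_into_hdfs": {"extract_orders"},
    "sessionize": {"load_into_hdfs"},
    "aggregate_sales": {"load_into_hdfs"},
    "export_to_warehouse": {"sessionize", "aggregate_sales"},
}

order = execution_order(flow)
print(" -> ".join(order))
```

The two middle jobs have no dependency on each other, so a workflow engine would be free to run them in parallel; the sketch only guarantees that upstream jobs come first.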

If you need to tame your Hadoop operations and provide automated and tight integration with your existing IT environment, applications and reporting solutions, give JobServer a look. It can be a great asset to help you run your Big Data operations more efficiently. Visit the JobServer product website for more details.

Contact Grand Logic and see how we can help you make better sense of your Big Data environment. JobServer is also partnering with other Big Data solution providers and major distributions to provide complete Big Data solutions for both your in-house and cloud Hadoop deployments. Please contact Grand Logic for more information to see how our products and services can make your Hadoop deployment a success.

Tuesday, February 7, 2012

Native Multi-Tenant Hadoop - Big Data 2.0

For Hadoop to gain wider adoption and lower the barrier to entry for a broader audience, it must become much more economical for businesses of all sizes to manage and operate a Hadoop processing cluster. Right now it takes a significant upfront investment in hardware, plus the IT know-how and admin skills to provision that hardware and to configure and manage a full-blown Hadoop cluster for any significant operation.

Cloud services like Amazon Elastic MapReduce help reduce some of this, but they can quickly become costly if you need to do seriously heavy processing, and especially if you need to keep data in HDFS rather than constantly moving it between your HDFS cluster and S3 so that you can shut down datanodes to save cost, as is standard practice with Amazon EMR. Utilities like Whirr also help push the infrastructure management onto the EC2 cloud, but here again serious data processing can quickly become cost prohibitive.

Operating short-lived Hadoop clusters can be a useful option, but many organizations need long-running processing and need to leverage HDFS for longer-term persistence, not just as a transient storage engine for the lifespan of a MapReduce run, as is the case with Amazon EMR. For Hadoop, and Big Data in general, to make the next evolutionary leap for the broader business world, we need a fully secure and multi-tenant Hadoop platform. In such a multi-tenant environment, organizations can share clusters securely and manage the processing load in very controllable ways, and each tenant can customize its Hadoop job flows and code in an isolated manner.

Hadoop already has various capacity management scheduling algorithms, but what is needed is higher-order resource management that can fully isolate different organizations from one another, for both HDFS security and data processing purposes, to support true multi-tenant capability. This will drive wider adoption within large organizations and by infrastructure service providers because it will make more efficient use of otherwise idle CPU and storage, in just the same way that SaaS has allowed software to achieve greater economies of scale and has democratized software for small and big organizations alike.

Native multi-tenant support in Hadoop will drastically reduce the upfront cost of rolling out a Hadoop environment, make the long-term costs much more manageable, and open the door for Hadoop and Big Data solutions to go mainstream in much the same way that Salesforce, for example, has created a rich ecosystem of solutions around business applications and CRM. This will also allow organizations to keep long-running environments and keep their data in HDFS for longer periods of time, allowing them to be more creative and spontaneous.

Thursday, January 12, 2012

End to End Big Data Solution

Grand Logic announces an end-to-end Big Data solution. Our flagship product, JobServer, and its supporting open source SDKs provide a superior platform for taking your raw data and creating business solutions that will drive ROI and deliver on the promise of Hadoop.

Hadoop is a great solution, but alone it is an island of data processing, algorithms and open source tools. JobServer integrates Hadoop into your enterprise to automate the flow of data and manage ETL processing, so that you can efficiently organize and track your Hadoop processing. It then delivers rich visualization of your Hadoop results to help you meet your business objectives with Big Data. Whether you are targeting mobile, tablet or desktop/web devices, JobServer's powerful GWT-based SDK can deliver a rich user experience and visualization for your reports and applications.

All this allows you to manage, monitor and track your Hadoop processing to deliver the control and central management you need to empower your developers and business analysts. JobServer with Hadoop allows you to acquire your data, process it and then visualize it. See this architecture diagram of our end to end JobServer/Hadoop solution stack.



Contact Grand Logic and see how we can help you make better sense of your Big Data environment. JobServer is also partnering with other Big Data solution providers and major distributions to provide complete Big Data solutions for both your in-house and cloud Hadoop deployments. Please contact Grand Logic for more information to see how our products and services can make your Hadoop deployment a success.

Friday, December 30, 2011

Big Data Predictions for 2012

Well, 2011 has been a great year for Hadoop and its supporting ecosystem. There is a growing base of subprojects evolving to fill the many niches in and around Hadoop, and companies are coming out of the woodwork to claim their piece of the pie, not to mention the VC money pouring into Big Data startups and the many established tech players changing their business plans to account for Hadoop. So what can we expect in 2012?

Here are seven predictions for what might be in store for Big Data in 2012:

1) Going Mainstream
Enterprises are only beginning to discover what they can do with Big Data analytics. Right now solutions like Hadoop are the secret weapon of the rich and social who can afford the investment in time, resources and infrastructure. Companies like Facebook and Twitter are using solutions like Hadoop to do things not possible before with traditional relational BI and analytics solutions. In 2012 we will see the window widening, with more traditional enterprises seeing the potential benefits that Hadoop analytics can offer. We will see more companies in various industries look to leverage Hadoop to ask questions about their operations and customers that were not possible before. Look for Hadoop to go more mainstream and lose some of the exoticness that currently relegates it to the big boys.

2) Put it in the Cloud
The barrier to entry is lowering for Hadoop, with players like Amazon offering low-cost platforms for initial Hadoop deployments. In 2012 we will see continued acceptance of the Cloud as the infrastructure of choice for deploying Hadoop. As Amazon and others improve their virtual private network services, private Cloud solutions for Hadoop will become more palatable for security-conscious enterprises. Cloud will be the target platform of choice for Hadoop in 2012. This will also open the door for smaller enterprises to dip their toe into Hadoop and discover what they have been missing in their volumes of consumer and operational data.

3) Automation and Integration
Right now most Hadoop deployments are islands of data and processing infrastructure. In the coming year we will see more tech companies begin to offer better tools that enable businesses to tie their back office data stores and data warehouses to their Hadoop environments in a more seamless fashion. Efficiently moving customer and business data out of traditional data stores such as relational databases, and processing and preparing it for Hadoop consumption, will be critical for successful Hadoop deployments. We anticipate a new category of ETL focused on managing data movement in and out of Hadoop and HDFS, and it will gain more traction in 2012. There are already Hadoop projects focusing on related areas, and we will see more Hadoop-type connectors popping up from traditional software vendors eager to get their products integrated with Hadoop.

4) Analytics and Visualization
Traditional BI reporting tools are not geared well toward the type of output generated by Big Data environments. A new breed of reporting tools and analytics solutions will emerge to better consume the output coming out of Big Data systems. Look for many traditional BI vendors to begin tailoring their front-end reporting solutions to fit with Hadoop and distributed data stores, including NoSQL data stores. But much of what traditional BI vendors offer will not be a natural fit, since most of them and their tools are more comfortable dealing with highly structured data. Also, as business analysts start to get a taste of the kinds of problems that can now be solved with Big Data, problems that could not be tackled before, they will begin to think of new questions to ask, and that will drive the need for more visualization and reporting of the data coming out of Big Data. So keep an eye out for startups and tech companies offering Big Data native analytics solutions tailored from the ground up for visualizing the statistical kinds of data coming out of Hadoop. Turning the statistical questions that are common with Big Data into visual reports that business users can understand will be a big leap forward in turning the raw data in many enterprises into meaningful value and actionable results.

5) Going Mobile
In 2012 we will see apps and solutions that allow business users to get a glimpse of their Hadoop operations and the resulting output on mobile and tablet devices. This one is not a big stretch considering the growing proliferation of mobile computing, but look for Hadoop to get a bit more mobile in 2012. Visualization of BI on mobile is a natural trend and Big Data is no exception.

6) Going Vertical and Healthcare
Healthcare is the perennial elephant in the room when it comes to needing operational efficiency improvements and managing exploding volumes of patient data (not to mention making sense of that data). From both the billing dimension and the diagnostic patient data aspect, healthcare will benefit greatly from the type of problems that Big Data can solve. In 2012 we will see healthcare providers and healthcare IT companies begin to seriously invest in Big Data to help them solve problems not possible before with traditional healthcare IT. Look for healthcare providers to tap Hadoop to better understand their patients, to deal with the volumes of digital patient data, and to help them deal with government regulations and compliance.

7) Real-Time Big Data?
This might be a stretch, but look for some early signs of various tech players looking to deliver more real-time business solutions around Big Data. Hadoop brings tremendous processing power to bear on problems that were not practical to solve before. With computing power growing and virtualization easier to manage and deploy, look for business users to demand that Big Data problems be solved in near real time. This will open the door for even more interesting applications of Big Data for business and even end consumers.

Let's regroup in twelve months and see how well these predictions panned out :)

Wednesday, December 21, 2011

Big Data is more than Map/Reduce

Companies of all sizes are looking for ways to make sense of their unstructured data. Data is growing at tremendous volumes and keeping track of it is becoming more challenging and expensive. Enter solutions such as Hadoop that allow you to make sense of this data using a highly distributed architecture that is based on horizontal scaling among other things.

Products like Hadoop are a critical layer of the solution, but they solve only one part of the overall tapestry needed to make sense of your Big Data. There are two key areas that are critical to your overall Big Data solution.

Imagine your Hadoop environment sitting in the center. Data must obviously be fed into it and data will flow out. An important aspect of a successful Hadoop deployment is managing these input and output points. Efficiently managing your data as it comes in raw and then leaves as information that is much easier to consume is key to a successful Big Data environment.

Data In - Preparing your Big Data
First, you need to prepare your data for processing by Hadoop. Unstructured data must be prepared and loaded and sometimes integrated with relational data from enterprise database sources, for example. All this must be automated and fed into your Hadoop environment. This is a critical step especially as companies get into more real-time Big Data processing. The need for real-time analysis is becoming more critical with the explosion of social information and online commerce.

This requires automation to feed and prepare your Hadoop environment and to minimize manual labor and the many potentially error-prone steps. Having this fully automated, with as little human intervention as possible, is critical. Solutions such as JobServer make this automation much more manageable. JobServer is ideal for integrating with your back office databases and can meld well with your IT environment using solutions such as SOA, Mule and ETL, to name just a few examples. JobServer can be used to centralize the logic and management for this preparation work so that steps like loading data into HDFS are all automated and easy to monitor and track.
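As a small illustration of automating the load step, the Python sketch below builds one `hadoop fs -put` command per input file and places the files under a dated HDFS directory. The paths, the dated-directory layout and the helper names are assumptions made for this example, not something JobServer prescribes; the sketch also assumes the target directory already exists and that the `hadoop` command is on the PATH when the commands are actually run.

```python
import subprocess
from datetime import date
from pathlib import PurePosixPath

# Hypothetical helper: plans the uploads without touching the cluster.
def build_put_commands(local_files, hdfs_base, run_date):
    """Return one `hadoop fs -put <local> <hdfs>` argv list per local file."""
    target_dir = PurePosixPath(hdfs_base) / run_date.isoformat()
    return [
        ["hadoop", "fs", "-put", local, str(target_dir / PurePosixPath(local).name)]
        for local in local_files
    ]

# Hypothetical helper: executes the planned uploads.
def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)

# Build (but do not execute) the upload plan for one nightly run.
commands = build_put_commands(
    ["/data/in/orders.csv", "/data/in/clicks.log"],
    "/warehouse/raw",
    date(2011, 12, 21),
)
for argv in commands:
    print(" ".join(argv))
```

Keeping command construction separate from execution makes the step easy to log and audit before anything touches the cluster, which is the kind of tracking a job scheduler is meant to provide.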

Data Out - Data Analytics
As large amounts of information are extracted out of solutions like Hadoop, visualizing the results is critical. Here too is where JobServer and its soafaces developer API can step in to address the challenge. soafaces is based on an open source API that allows for building rich reporting and analytic solutions that capture data as it comes out of your Hadoop environment and visualize it in an easy-to-manage way. soafaces is based on the Google GWT framework for GUI development and supports rich web-based reporting and graphing technologies for visualizing your results. JobServer can also be leveraged here to show, in a very organized manner, the results of each Hadoop job run, and to give the right people in your organization access to the final results through web reports, spreadsheets, email alerts and so on.
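For the spreadsheet case, here is a minimal Python sketch that turns plain-text Hadoop job output into CSV that a spreadsheet can open. It assumes the job wrote tab-separated key/value lines (the default for Hadoop's text output); the function name, column headers and sample rows are made up for illustration.

```python
import csv
import io

# Hypothetical helper, not part of JobServer or soafaces.
def hadoop_output_to_csv(lines, header):
    """Convert tab-separated key/value lines into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines between records
        # Split on the first tab only, so tabs inside the value are preserved.
        key, _, value = line.partition("\t")
        writer.writerow([key, value])
    return buffer.getvalue()

# Illustrative rows, as they might appear in a part-00000 output file.
sample = ["west\t42\n", "east, north\t7\n"]
report = hadoop_output_to_csv(sample, ["region", "orders"])
print(report)
```

The `csv` module quotes any field that contains a comma, so keys such as "east, north" still land in a single spreadsheet cell.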

Contact Grand Logic and see how we can help you make better sense of your Big Data environment. JobServer is also partnering with other Big Data solution providers and major distributions to provide complete Big Data solutions for both your in-house and cloud Hadoop deployments.

Please contact Grand Logic for more information.

Wednesday, December 7, 2011

Tame your Hadoop - Hadoop Professional Services

Grand Logic is pleased to announce our expanded consulting services specializing in Hadoop solutions. Hadoop is quickly becoming the tool of choice for Big Data analytics and our expertise with Hadoop and applying our tools like JobServer (job workflow/scheduling engine) and soafaces (open source framework) enable us to build complete enterprise solutions around Hadoop for our customers.

Hadoop comes with many great supporting modules but needs additional tools and features to make it easy to manage and organize all the activity and content around your Hadoop operations. With our consulting services we can quickly come in and build a management and integration layer to help you automate and manage your Hadoop deployment. This lets you tie in your back office data and IT systems to feed your Hadoop operations, automating and streamlining data movement between your business data and your Hadoop analytics. This is vital to getting a successful ROI from Hadoop. If you can't efficiently feed data into Hadoop and extract it (and visualize it), your Hadoop number crunching will be for naught. With JobServer as part of your Hadoop environment and our expertise and professional services you will be able to:
  • Effectively build custom ETL processing between your back office data and your Hadoop data stores
  • Build, package and reuse custom logic and server-side tasks for managing and editing your Hadoop jobs
  • Compose complex Hadoop workflows from multiple simpler Hadoop jobs (and non-Hadoop jobs) to build support for rich scenarios where data is moved between HDFS and local storage and between multiple Hadoop jobs.
  • Integrate easily with modules like Cascading and Pig.
  • Apply security on a job-by-job basis to restrict all aspects of job configuration, monitoring, reporting and execution on a user-by-user basis.
  • Application permissions - control what tools are available to which users.
  • Detailed alerting (via email or SMS) to report on status and failures by job and by job group.
  • Organize jobs into groups and partitions for user organization and resource management.
  • Powerful job scheduling - any scheduling pattern you can think of for running your Hadoop jobs can be built by our consulting team.
  • Easily create custom reports per Hadoop job. Using the soafaces framework we can build custom reports that let you view the status and results of each of your Hadoop jobs.
  • We are experts with soafaces and GWT and can build rich visualization for your Hadoop-generated data. We can help you visualize this on web, mobile and tablet devices to develop rich analytics that can be consumed by decision makers in your company.
So if you want to tame your Hadoop environment, Grand Logic has the technical expertise and the chest of tools and frameworks to get you going and organized. Contact us today to see what we can do for you.