  • Databricks, Spark and BDAS

    Discussion of BDAS (Berkeley Data Analytics Stack), especially Spark and related projects, and also of Databricks, the company commercializing Spark.

    August 17, 2017

    More notes on the transition to the cloud

    Last year I posted observations about the transition to the cloud. Here are some further thoughts.

    0. In case any doubt remained, the big questions about transitioning to the cloud are “When?” and “How?”. “Whether”, by way of contrast, is pretty much settled.

    1. The answer to “When?” is generally “Over many years”. In particular, at most enterprises the cloud transition will span the tenures of multiple CIOs.

    Few enterprises will ever execute on simple, consistent, unchanging “cloud strategies”.

    2. The SaaS (Software as a Service) vs. on-premises tradeoffs are being reargued, except that proponents now spell SaaS C-L-O-U-D. (Ali Ghodsi of Databricks made a particularly energetic version of that case in a recent meeting.)

    3. In most countries (at least in the US and the rest of the West), the cloud vendors deemed to matter are Amazon, followed by Microsoft, followed by Google. And so, when it comes to the public cloud, Microsoft is much, much more enterprise-savvy than its key competitors.

    August 10, 2017

    Notes on data security

    1. In June I wrote about burgeoning interest in data security. I’d now like to add:

    We can reconcile these anecdata pretty well if we postulate that:

    2. My current impressions of the legal privacy vs. surveillance tradeoffs are basically:

    June 30, 2017

    Analytics on the edge?

    There’s a theory going around to the effect that:

    There’s enough truth to all that to make it worth discussing. But the strong forms of the claims seem overblown.

    1. This story doesn’t even make sense except for certain new classes of application. Traditional business applications run all over the world, in dedicated or SaaSy modes as the case may be. E-commerce is huge. So is content delivery. Architectures for all those things will continue to evolve, but what we have now basically works.

    2. When it comes to real-world appliances, this story is partially accurate. An automobile is a rolling network of custom Linux systems, each running hand-crafted real-time apps, a few of which also have minor requirements for remote connectivity. That’s OK as far as it goes, but there could be better support for real-time operational analytics. If something as flexible as Spark were capable of unattended operation, I think many engineers of real-world appliances would find great ways to use it.

    3. There’s a case to be made for something better yet. I think the argument is premature, but it’s worth at least a little consideration.

    June 16, 2017

    Generally available Kudu

    I talked with Cloudera about Kudu in early May. Besides giving me a lot of information about Kudu, Cloudera also helped confirm some trends I’m seeing elsewhere, including:

    Now let’s talk about Kudu itself. As I discussed at length in September 2015, Kudu is:

    Kudu’s adoption and roll-out story starts:

    June 14, 2017

    Cloudera Altus

    I talked with Cloudera before the recent release of Altus. In simplest terms, Cloudera’s cloud strategy aspires to:

    In other words, Cloudera is porting its software to an important new platform.* And this port isn’t complete yet, in that Altus is geared only for certain workloads. Specifically, Altus is focused on “data pipelines”, aka data transformation, aka “data processing”, aka new-age ETL (Extract/Transform/Load). (Other kinds of workload are on the roadmap, including several different styles of Impala use.) So what about that is particularly interesting? Well, let’s drill down.

    *Or, if you prefer, improving on early versions of the port.

    April 13, 2017

    Analyzing the right data

    0. A huge fraction of what’s important in analytics amounts to making sure that you are analyzing the right data. To a large extent, “the right data” means “the right subset of your data”.

    1. In line with that theme:

    2. Business intelligence interfaces today don’t look that different from what we had in the 1980s or 1990s. The biggest visible* changes, in my opinion, have been in the realm of better drilldown, à la QlikView and then Tableau. Drilldown, of course, is the main UI for business analysts and end users to subset data themselves.

    *I used the word “visible” on purpose. The advances at the back end have been enormous, and much of that redounds to the benefit of BI.

    3. I wrote 2 1/2 years ago that sophisticated predictive modeling commonly fit the template:

    That continues to be tough work. Attempts to productize shortcuts have not caught fire.

    March 12, 2017

    Introduction to SequoiaDB and SequoiaCM

    For starters, let me say:

    Also:

    Unfortunately, SequoiaDB has not captured a lot of detailed information about unpaid open source production usage.

    December 18, 2016

    Introduction to Crate.io and CrateDB

    Crate.io and CrateDB basics include:

    In essence, CrateDB is an open source and less mature alternative to MemSQL. The opportunity for MemSQL and CrateDB alike exists in part because analytic RDBMS vendors didn’t close it off.

    CrateDB’s not-just-relational story starts:

    November 23, 2016

    DBAs of the future

    After a July visit to DataStax, I wrote:

    The idea that NoSQL does away with DBAs (DataBase Administrators) is common. It also turns out to be wrong. DBAs basically do two things.

    • Handle the database design part of application development. In NoSQL environments, this part of the job is indeed largely refactored away. More precisely, it is integrated into the general app developer/architect role.
    • Manage production databases. This part of the DBA job is, if anything, a bigger deal in the NoSQL world than in more mature and automated relational environments. It’s likely to be called part of “devops” rather than “DBA”, but by whatever name it’s very much a thing.

    That turns out to understate the core point, which is that DBAs still matter in non-RDBMS environments. Specifically, it’s too narrow in two ways.

    My wake-up call for that latter bit was a recent MongoDB 3.4 briefing. MongoDB certainly has various efforts in administrative tools, which I won’t recapitulate here. But to my surprise, MongoDB also found a role for something resembling relational database design. The idea is simple: A database administrator defines a view against a MongoDB database, where views:

    October 21, 2016

    Rapid analytics

    “Real-time” technology excites people, and has for decades. Yet the actual, useful technology to meet “real-time” requirements remains immature, especially in cases which call for rapid human decision-making. Here are some notes on that conundrum.

    1. I recently posted that “real-time” is getting real. But there are multiple technology challenges involved, including:

    2. In early 2011, I coined the phrase investigative analytics, about which I said three main things:
