The Cauvery Water debate — opinions

This was inspired by the controversy around the Cauvery water debate in mid-September 2016.

Before we begin, I’ll set down my biases, priors and assumptions:

  1. I’m from TN and have been living in Bangalore for about 10 years.
  2. I’m unaware of the actual rainfall levels, agricultural needs, ecological needs and so on.
  3. I’m not going to propose a verdict so much as a process/method for dealing with the conflict that doesn’t depend on politicians or the Supreme Court.
  4. I travelled to most parts of TN in my youth, and have trekked to most parts of Karnataka (speaking broken Kannada) in the last 10 years (obviously not as much as TN), so I have a grasp of the cultural/mental attitudes in general.

 

One reason I’m ruling out a political solution is that we live in a representative democracy. The incentives in that setup push politicians to do whatever gets them the most votes from the biggest part of their demographic. Expecting them to talk to politicians from the other state and come to a compromise is hard because, on top of representative democracy, we have a multi-party system, which means local parties have scope to ignore the interests of the other state’s parties and people. I’ve seen a few national parties take contradictory stances depending on which state’s division is making the statement. In addition to these incentives, this is a situation with prisoner’s-dilemma-type dynamics (that is, if one agent stops co-operating and defects, the rest are better off doing the same). The only rewards for the politicians in this are media time and vote-bank support.
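To make the prisoner’s dilemma point concrete, here is a minimal sketch in Python. The payoff numbers are made up for illustration (they are my assumption, not data about either state); all that matters is their ordering, which is the usual one for this kind of game.

```python
# Hypothetical payoffs (higher is better) for one party sharing a resource.
# Keys are (my_move, other_move); values are my payoff. Illustrative only.
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}
MOVES = ("cooperate", "defect")


def best_response(other_move):
    """Return the move that maximises my payoff, given the other party's move."""
    return max(MOVES, key=lambda my_move: PAYOFF[(my_move, other_move)])


for other in MOVES:
    print(f"if the other party plays {other!r}, my best response is {best_response(other)!r}")
```

With these numbers, defecting is the best response whatever the other party does, even though both parties cooperating pays each of them more than both defecting.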

 

So what I do advocate is a mix of open data and predictive models, plus persuasion and a media (attention) frenzy that overtakes anything like the top-down media stuff the politicians can stir up. It won’t work without both, but I have no idea what will and won’t succeed on the latter front, so I’ll focus the majority of the post on the former.

I’m advocating open access to data (water level, population, catchment area, drought area, cultivation area, predicted loss of agricultural area, etc.), maintained by a panel of experts but open to all for debate and opinion.

Major points (on the open data front):

  1. Make the data open and easily available. Here the data would be catchment areas, agricultural need estimates, actual rainfall, water table levels, water distribution wastage/efficiency, sand mining and its effects on water flow, and the economic impacts of the water shortage (bankruptcies, loss of revenue, loss of investment, etc.). (There are some platforms like this and this already in India)*
  2. Create/use open data science platforms that let bloggers and volunteers modify the models (for estimates) and publish blogs/predictions based on the given data, but with different models and parameters. (Some tools can be found here and here)
  3. Try to present the models in a way that even people without programming experience can interact with. (The notebook links I provided above need Python knowledge to edit, but anything built with this won’t)
  4. Add volunteers to cross-check some of the data, like sand mining, rainfall levels, etc.
  5. Publish/collaborate with reporters to inform/write stories around the issue with the help of the models (something at least at the level of science journalism).

 

Some thoughts (on the media-based persuasion front):

  1. Recruit enough people interested in the exercise of figuring out the details of the issue’s impact.
  2. Make sure you can reach the ones who are currently most likely to indulge in violence (I can only guess at the details, but a better-targeted marketing strategy is what we need).

 

OK, enough of the idealistic stuff.

  1. Will this work? Well, the question is too broad. Will it work to bring out the truth? It can bring us closer to the truth than what we have now. More importantly, it can establish a method/platform for getting closer to data-driven debates and arguments.
  2. Will it cut the violence/bandhs/property damage, etc.? That boils down to the activism or work done on the media and marketing front. Leaving the persuasion part to politicians, whose incentives are skewed towards gaining votes (from the steady voting population), is the problem now. So can we have alternative parties (say, business owners) using persuasion tactics only to discourage violence? I don’t know, but it seems likely that violence and martyrdom are preferred mostly by politicians and dons, and not by the rest (say media, local business owners, sheep-zens, etc.). So this move has a lower expected probability of violence.
  3. Who will pay for all this effort? A very pertinent question. The answer is that it’s going to be hard to pay the costs of even maintaining an information system, not to mention the cost of collecting the data. That said, I think the big challenge is the cost of collecting the data, and finding volunteers (something like this in the US) to collect it for free. As for hosting, building and maintaining an information system, I think a cheap way can be found.
  4. Is this likely to happen? Haha, no. Not in the next half century or so.
  5. Is there a cheaper way? Not at the global/community/country level. At the level of individuals (media/politicians/public, aka you and me), yes, but it’s not really cheaper once you count the costs it inflicts. Maybe I’m just not creative enough; feel free to propose one, just be careful to include the costs to others around you now and to others in the future (aka your children).
  6. Why will this work? Apart from the mythical “Sunlight is the best disinfectant”, I think this approach is basically an ambiguity-reduction approach, which translates to breaking down status illegibility (one reason no politician is likely to support this idea). Status illegibility is the foundation of socio-political machinations, and it applies to modern-day state politics. So this will raise the probability of something close to a non-violent solution.

* - I haven’t checked whether these data sets are already openly available, but I doubt they are, and even if they are, some of the data are estimates, and we would need the models that made those estimates to be public too.

 

UPDATE: A few weeks after this I looked up, on Google Maps, the path the Cauvery follows from its origin to its end at the sea, and realized I’ve actually visited more of the places it flows through in Karnataka and a lot fewer in Tamil Nadu. But that doesn’t change my stance/bias on the misuse/abuse of sand mining and the conversion of lake resources into housing projects in TN, as that’s a broader, pervasive and pertinent issue.

 

UPDATE-1: A few months after writing this there was a public announcement which, if you read it closely enough, is a typical persuasion-negotiation move: a specific action (a strong concession, right now) is demanded from the opponent in exchange for a vague, under-specified promise in the future. That this whole thing played out on the news is more support for my thesis that the incentives for politicians are skewed too much towards PR.

 

UPDATE-2: Some platforms for hosting data, models and code do exist, as below (although with a different focus):

  1. Kaggle
  2. Drivendata
  3. Crowdai

So the question of collecting, cleaning, verifying and updating the data is left. Also, here’s a Quora answer on the challenges of bootstrapping a data science team, which will be needed for this.

The “when is cheryl’s birthday?” problem — ipython solving steps

A cool step-by-step solution to the recently viral “When is Cheryl’s birthday?” problem.
http://ift.tt/1GsacIS
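For a feel of what solving it in code looks like, here is a rough sketch of my own (not a copy of the linked notebook). The ten candidate dates and the three statements are the ones from the widely circulated version of the puzzle; they are not quoted in this post, so treat them as an assumption. Albert is told the month and Bernard the day.

```python
# Candidate dates from the commonly circulated puzzle (assumed, see above).
DATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]


def with_month(month, dates):
    return [d for d in dates if d[0] == month]


def with_day(day, dates):
    return [d for d in dates if d[1] == day]


def statement1(date):
    """Albert: I don't know the date, and I know Bernard doesn't know either."""
    candidates = with_month(date[0], DATES)
    return len(candidates) > 1 and all(
        len(with_day(d[1], DATES)) > 1 for d in candidates
    )


def statement2(date):
    """Bernard: at first I didn't know, but after Albert's statement I do."""
    at_first = with_day(date[1], DATES)
    after = [d for d in at_first if statement1(d)]
    return len(at_first) > 1 and len(after) == 1


def statement3(date):
    """Albert: then I also know."""
    candidates = [
        d for d in with_month(date[0], DATES) if statement1(d) and statement2(d)
    ]
    return len(candidates) == 1


solutions = [d for d in DATES if statement1(d) and statement2(d) and statement3(d)]
print(solutions)  # [('July', 16)]
```

Each statement is just a filter on the list of dates still possible, which is why the whole thing fits in a few list comprehensions.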

from The “when is cheryl’s birthday?” problem — ipython solving steps

from Tumblr http://ift.tt/1TTBDwS

via IFTTT


Stack Ranking

Why stack-ranking is always a case of ‘the house always wins’:

Disclaimer:

1. It’s partly a defendant’s argument, and I am biased towards my client (i.e. the employee).

2. I’ve little experience managing a group of people and don’t claim to know all the challenges involved.

3. My research/reading has been restricted to ideas/theories/assumptions that support my case. It’s not the thorough, unbiased, covering-all-bases kind of literature survey. (**wink** stack ranking vs performance/vitality curve distinction)

At first look it seems like a wonderful meritocratic setup. It uses relative comparison with peers (not unlike the PageRank algorithm/eigenmorality). On the face of it, it is a brilliant idea, or at least a good idea that works well when measuring quantities that haven’t been quantified enough before. In fact, if I were trying to do science on measuring performance, it would be a reasonably sane and tried approach. However, there are problems with using it.
I understand why it makes decisions easier (especially in big organizations): you get a single number that’s guaranteed to fall within some expected values (in the probability theory sense), and forcing a curve simply makes it easier to fit a fixed amount for bonuses and incentives. However, here’s the challenge: how do you know your employees’/managers’/directors’ performance falls on a bell curve*? Maybe your company’s hiring practices consistently bring in poorly performing, average or high-performing employees (all three in comparison to the general population). In which case, aren’t you alienating a high-performing employee just because his peers did better (perhaps in revenue)? The catch is that revenue is influenced by more factors than the performance of your employees.
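A tiny sketch of that alienation problem, in Python. Everything here is hypothetical: the `forced_ranking` helper, the 10% bottom bucket and the scores are my own illustrative assumptions, not any company’s actual scheme.

```python
def forced_ranking(scores, bottom_fraction=0.1):
    """Label the lowest-scoring fraction of a team 'bottom', whatever the absolute scores."""
    n_bottom = max(1, round(len(scores) * bottom_fraction))
    ranked = sorted(scores, key=scores.get)  # lowest score first
    bottom = set(ranked[:n_bottom])
    return {name: "bottom" if name in bottom else "ok" for name in scores}


# Hypothetical team in which everyone scores 90 or more on an absolute 0-100 scale.
team = {"A": 99, "B": 98, "C": 97, "D": 96, "E": 95,
        "F": 94, "G": 93, "H": 92, "I": 91, "J": 90}

labels = forced_ranking(team)
print(labels["J"], team["J"])  # bottom 90
```

The forced curve has to put somebody in the bottom bucket, so "J" lands there with a score of 90 simply because the peers did slightly better.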

Here’s a quote from here:

“You have to have an objective when you do stuff like this. At GE there was only one objective, and that was to force honesty. That’s all it ever was—to force an honest discussion between your manager and you. And there’s nothing that quite forces that more than employees knowing that they expect to know how that manager ranks them, and then asking that manager, ‘Tell me where I rank and tell me why.’”

See anything wrong in that argument? Try replacing ‘honesty’ with ‘dishonesty’ and the argument is still logically consistent and sounds right. Guess why: because there’s an underlying assumption that stack-ranking raises honesty (or honest communication). While I agree it’s a good way to force managers to give feedback (especially negative) to their employees, I am not convinced it’s good or that it encourages honesty. I get that people (and managers) are likely to avoid giving negative feedback, and that they are also subject to confirmation bias, all of which can create bloated, inefficient departments/teams. Here’s the catch: when you force something like this, you eventually push the lowest ranks onto people who are bad negotiators (with their managers) and therefore don’t push back when given negative feedback. Over half a decade or so you get a whole company of employees who are all very good negotiators (with no correlation, positive or negative, to performance).
In the end, that defense sounds way too much like someone who’s a reformer stuck in the values/virtues node, aka the holy priest (I know, I’ve been guilty of it so often, and probably am right now). Enough of debate-level arguments; here’s an attempt at discussing why it becomes something bad.
In theory, it can encourage managers to be honest and give negative feedback to their hires/employees, but in practice it comes down to compromises/favours/future promises traded between the employee and the manager. You’re forcing the manager to make a compromise/favour/future promise to one employee in order to pay the other. Even then, if it is still one number plus some subjective reasoning between manager and employee, it has some hope of being a measure**. Now, that post doesn’t make it clear why it’s a bad idea to use a normal curve for measuring performance, but that is a basic necessity before we can talk about using/finalizing measures of a hitherto unquantified phenomenon. For that we need to understand where this vitality curve concept comes from.
A Google Scholar search shows up nothing. I’ve been trying to find what research went into the whole stack ranking idea. A Google search turns up the vitality curve. OK, where could Jack Welch have picked up this insane idea of a vitality curve? The closest I can find is

Central Limit Theorem in Statistics.
The basic premise of this theorem is that if we take enough samples of a random variable of unknown distribution (with finite variance), the averages of those samples will be approximately normally distributed.

This is not the strongest form of the theorem, but it is the basic one that the rest are based on.

Now let’s look at what this means. When you’re examining a measurable quantity whose distribution is unknown, you can take samples (enough times and of a large enough size) and average each one; the averages will approximate a normal distribution.
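Here is a small simulation of that claim, as a sketch using only Python’s standard library. The choice of an exponential source, the sample size of 50 and the 2000 repetitions are arbitrary assumptions picked for illustration.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples):
    """Average `sample_size` draws, and repeat that `n_samples` times."""
    return [
        statistics.fmean(draw() for _ in range(sample_size))
        for _ in range(n_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# A clearly non-normal source: exponential with mean 1 and standard deviation 1.
means = sample_means(lambda: rng.expovariate(1.0), sample_size=50, n_samples=2000)

centre = statistics.fmean(means)
spread = statistics.stdev(means)
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)

print(f"mean of sample means: {centre:.3f}")         # theory: 1
print(f"std of sample means:  {spread:.3f}")         # theory: 1 / sqrt(50), about 0.141
print(f"share within 1 std:   {within_one_sd:.2f}")  # normal curve: about 0.68
```

The individual draws are heavily skewed, but the averages cluster symmetrically around the true mean, which is the behaviour the theorem describes. Note that every draw here is independent and comes from the same distribution; that is exactly the assumption that is hard to defend for performance ratings.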
Why/How is this useful?

Well, it becomes useful when you want to compare two random variables and see whether they are correlated or share common causal factors.

Especially when you have figured out ways to manipulate/control one of the variables, you can simply design experiments that measure both of these variables, plot the difference of their (sample) averages, and see how much it deviates from the standard normal curve. This can tell us whether they are positively correlated, negatively correlated or simply unrelated. This is how experimental sciences work. Of course, it’s not perfect, but it’s the best we have.**

On top of all this, it breaks down if the critical assumption of IID (independent and identically distributed samples) doesn’t hold.***

Now, let’s get back to the original topic. If your organization/manager is implementing stack ranking and they refer to the central limit theorem (you’re in luck; I haven’t heard any manager relate the two, or even name either of them), you can question where their idea of normality comes from. There’ll be cases where your manager tells you your performance was average/below average/above average with respect to the rest of the team/organization. You get to ask how they arrived at the normal curve’s values (the most likely answer would be the past year’s performance).

But here’s the catch: if they understand the experimentation process, the challenge then is to prove/question whether the current curve has seen enough samples. I don’t think that’s possible in most organizations/most roles. Of course, in very well-established industries with very specifically defined roles it makes sense and is possible, but I’m not sure it applies well in the modern business environment.

Now, the bigger your organization, the more likely it is that your performance is rated along different aspects/vectors/areas, which essentially multiplies the number of variables and complicates the problem (requiring more samples to normalize).

What are the basic premises of the “Central Limit Theorem”?
Well, for one, that you are dealing with random variables, i.e. samples drawn independently from the same distribution (the IID assumption above).

* - A quick read of the blog here suggests not all companies use the standard normal distribution, but rather normal/Gaussian distributions with different spreads. Facebook seems to have a narrower spread than Amazon (which makes me think of the differences in corporate culture and what this model entails for it, but that needs more thinking and is perhaps another blog post, about Nash equilibrium and competition vs co-operation. Hunch/guess: more competition than co-operation at Facebook, and vice versa at Amazon). It’s not clear what Google uses.

** - Scientists, don’t get angry with this. I know there are more nuances that go into statistical inference, but I think this is the core value/process, and it can be explained simply. Besides, I am not a real scientist, just a guy who left academia.

*** - Making this assumption about a lot of the variables I’ve seen used in a performance review is rather comical (like this).

P.S: To put a cynical quip (paraphrasing, I think, Douglas Adams): the universe is either mildly malevolent or neutral (i.e. definitely not benevolent); the modern workplace is definitely malevolent (either mildly or fatally).

from Stack Ranking

from Tumblr http://ift.tt/1TTC9Lj

via IFTTT


HTTP protocol.. RFC study notes

Alright, I should have done this at least 2 years ago and was too much of an idiot not to; better late than never.

Study Notes — http protocol (RFC 7230 – 7235)*

RFC 7230 — Message syntax and Routing

Key parties:
1. HTTP server: the system that responds to HTTP requests with HTTP responses
2. User agent/HTTP client: the system that sends the HTTP requests

Intermediaries:
There are some intermediate parties in the communication between 1 and 2. (Because of how tcp/ip works).
Note: these are relevant because some of the keywords are related to them (aka, this is where the HTTP vs TCP/IP abstraction leaks).
1. proxy:
a message-forwarding agent selected by the client (via configurable rules),
commonly used to group an organization’s requests
2. gateway:
an intermediary that acts as an origin (HTTP) server for the outbound connection but translates the requests and forwards them inbound to other servers.
3. tunnel:
A tunnel is a blind relay between 2 connections that passes messages on. It differs from a gateway in not translating the requests, just blindly passing them on. Generally used in situations like TLS/HTTPS secure communication through a firewall proxy

Caches:
Details in RFC 7234.
1. A local store of previous response messages
2. A response may or may not be cached based on:
a. the cacheable flag being set,
b. a set of constraints defined in RFC 7234

Versioning:
A message has at least these fields:
Version is <major>.<minor>

HTTP-version = HTTP-name "/" DIGIT "." DIGIT
HTTP-name = %x48.54.54.50 ; "HTTP", case-sensitive

The major version denotes the HTTP messaging syntax, while the minor version is the sender’s communication capabilities.
Hmm.. these two don’t seem well-defined so far in the RFC.
My first guess was that the major version tells the server which protocol-specific syntax (i.e. http/https/ftp/etc.) the request uses to connect,
while the minor version is the version the client understands, so that the response can be formatted in a compatible manner.
That guess about the major number is wrong. The RFC says:

The intention of HTTP’s versioning design is that the major number
will only be incremented if an incompatible message syntax is
introduced, and that the minor number will only be incremented when
changes made to the protocol have the effect of adding to the message
semantics or implying additional capabilities of the sender.
However, the minor version was not incremented for the changes
introduced between [RFC2068] and [RFC2616], and this revision has
specifically avoided any such changes to the protocol.
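
As a quick check of my reading of the HTTP-version grammar above, here is a minimal Python sketch. The `parse_http_version` helper is my own hypothetical name, not something from the RFC or from any HTTP library.

```python
import re

# HTTP-version = HTTP-name "/" DIGIT "." DIGIT, with HTTP-name the
# case-sensitive string "HTTP". [0-9] is used instead of \d so that only
# ASCII digits match, as the ABNF DIGIT rule requires.
HTTP_VERSION = re.compile(r"HTTP/([0-9])\.([0-9])")


def parse_http_version(text):
    """Return (major, minor) for a valid HTTP-version string, else raise ValueError."""
    match = HTTP_VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid HTTP-version: {text!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_http_version("HTTP/1.1"))  # (1, 1)
```

Under this grammar both "http/1.1" (wrong case) and "HTTP/1.10" (two-digit minor) are rejected, since each number is exactly one DIGIT.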

Uniform Resource Identifiers:
1. identifies resources
For the URI syntax, I’ll just quote the rules the RFC pulls in from RFC 3986.

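URI-reference = <URI-reference, see [RFC3986], Section 4.1>
absolute-URI  = <absolute-URI, see [RFC3986], Section 4.3>
relative-part = <relative-part, see [RFC3986], Section 4.2>
scheme        = <scheme, see [RFC3986], Section 3.1>
authority     = <authority, see [RFC3986], Section 3.2>
uri-host      = <host, see [RFC3986], Section 3.2.2>
port          = <port, see [RFC3986], Section 3.2.3>
path-abempty  = <path-abempty, see [RFC3986], Section 3.3>
segment       = <segment, see [RFC3986], Section 3.3>
query         = <query, see [RFC3986], Section 3.4>
fragment      = <fragment, see [RFC3986], Section 3.5>

absolute-path = 1*( "/" segment )
partial-URI   = relative-part [ "?" query ]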
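
To see how those pieces map onto a real URI, here is a small sketch using `urlsplit` from Python’s standard `urllib.parse` module. The example URL is made up, and `urlsplit` follows the generic URI syntax rather than implementing RFC 7230’s grammar rule by rule, so treat it as an approximation.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en#intro")

print(parts.scheme)    # http
print(parts.netloc)    # www.example.com:8080  (the "authority")
print(parts.hostname)  # www.example.com       (the "uri-host")
print(parts.port)      # 8080
print(parts.path)      # /docs/index.html      (a sequence of "/" segment)
print(parts.query)     # lang=en
print(parts.fragment)  # intro
```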

http URI Scheme:

* — Original RFC was 2616 http://ift.tt/1qWngNQ, but it was superseded by these.

from HTTP protocol.. RFC study notes

from Tumblr http://ift.tt/1TJ1z20

via IFTTT


Pure math.. — Definition for Explain like I’m 5

Why Do We Pay Pure Mathematicians?

Brilliant writing, as Math with Bad Drawings always comes up with.

from Pure math.. — Definition for Explain like I’m 5

from Tumblr http://ift.tt/1TTC6PB

via IFTTT


What I would change about python?

1. The semantics of the ‘or’ keyword. I know it’s supposed to make code readable as it currently exists (i.e. evaluate the left-hand expression and return it if it is truthy; otherwise evaluate and return the right-hand expression, whatever its value). I’d rather have it return True or False instead. I think that’s more logical for a programmer, and perhaps the current behaviour is part of Python not being a purely functional language.

2. The distinction between expression and statement.

3. Side effects: while it’s possible to write code that provides a functional interface, the interpreter does not guarantee the absence of side effects/assignments.
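
A quick sketch of points 1 and 3 in plain Python. The `append_one` function is a made-up example, nothing more.

```python
# Point 1: `or` returns one of its operands, not necessarily a bool.
first = 0 or "fallback"      # left is falsy, so the right operand is returned
second = "left" or "right"   # left is truthy, so the right is never evaluated
third = 0 or []              # both falsy: the right operand comes back, not False
as_bool = bool(0 or [])      # wrap in bool() to get a real True/False

print(first, second, third, as_bool)  # fallback left [] False


# Point 3: a function that looks functional can still mutate its argument.
def append_one(items):
    items.append(1)  # side effect on the caller's list; the interpreter allows it
    return items


data = []
append_one(data)
print(data)  # [1]
```

So getting the True/False behaviour I’d prefer takes an explicit `bool(...)`, and avoiding side effects is a matter of discipline rather than something the language enforces.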

from What I would change about python?

from Tumblr http://ift.tt/1TJ0M11

via IFTTT


Why read fiction?

Why do I read fiction? Or what do I get out of reading fiction?
Vivek Haldar here talks about how he doesn’t read fiction because it does nothing for him, or rather means nothing to him.
It set me thinking, like a knot or a thorn in my brain. I read it a long time ago, and my first thought was that I am the opposite.
I prefer reading fiction. In the time since, I have held the question in my mind and come up with the following possibilities:

0. Theory of mind: there’s some (scanty, debatable) evidence that reading fiction helps in understanding how other minds work.
Here’s the study
And I do have a tendency to retreat into reading fiction when I am upset/confused or trying to figure out the right decision to make (usually regarding people in my life).

1. I find it kind of clears my head and goads me into logical thinking*, i.e. once I am done reading the fiction to completion.

2. It definitely affords a comfortable thing to do without being (nay, feeling) guilty of procrastination, since reading is supposedly always considered a good thing (socially).

3. It could also simply be my way of dealing with the modern world’s craziness, much like what VGR refers to here.

4. It is good practice for thought experiments and therefore makes it easier to consider alternative explanations**.

5. It definitely helps to clear the emotional components out of my decision-making/thinking. More specifically, on the alertness/arousal scale, it helps lower my arousal level and therefore raises the alertness/arousal ratio. (One of my hypotheses is that rational thinking is directly proportional to the ratio of alertness to arousal levels.)

* - Might simply be wishful thinking on my part.
P.S: The above is a rather descriptive attempt. Some of the points may, and probably do, overlap with other points. The bullet-point format is organized simply for communication, not for empirical hypothesis testing.

from Why read fiction?

from Tumblr http://ift.tt/1TTC6iz

via IFTTT


measure theory and cog. psych.

Measure theory defines three attributes for some variable to be considered a measure*.

  1. Non-negativity:
    It’s the idea that a value should not go negative when measured (by whatever means/equipment in the real world).
  2. Null empty set:
    It’s the idea that the measure is zero for the empty (null) set.
  3. Countable additivity:
    This one basically means that if there are countably many sets with measures ‘X1, X2, …, Xn, …’, then the measure of the union of all the sets is less than or equal to the sum of ‘X1, X2, …, Xn, …’, with equality when the sets are pairwise disjoint.

* - I think it can be extrapolated/extended to the measure of any geometric property, but not beyond that. Very tellingly, it is used widely in a field called real analysis. After all, in electrical engineering we have all sorts of complex, negative and fractional numbers. I picked up these definitions from the Fractal Geometry book rather than the wikipedia links provided.

ps: There’s a more generalized definition of measure here that might fit these too.

** - If you think about it, these are all just a set of rules for determining whether a given set satisfies the properties of a set of numbers, but that’s beside the point here.
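
For a concrete feel of the three properties listed above, here is a minimal Python sketch using the counting measure (the measure of a finite set is its number of elements). The example sets are arbitrary, and a handful of checks on finite sets is of course an illustration, not a proof.

```python
def counting_measure(s):
    """The counting measure: the 'size' of a finite set is its number of elements."""
    return len(s)


a, b, c = {1, 2}, {2, 3}, {4}

# 1. Non-negativity: no set gets a negative measure.
non_negative = all(counting_measure(s) >= 0 for s in (a, b, c, set()))

# 2. Null empty set: the empty set has measure zero.
null_empty = counting_measure(set()) == 0

# 3. Additivity: the union never measures more than the sum of the parts ...
subadditive = counting_measure(a | b) <= counting_measure(a) + counting_measure(b)

# ... and measures exactly the sum when the sets are disjoint.
additive_when_disjoint = (
    counting_measure(a | c) == counting_measure(a) + counting_measure(c)
)

print(non_negative, null_empty, subadditive, additive_when_disjoint)  # True True True True
```

The interesting question for the rest of this post is whether anything like `counting_measure` can even be written down for a psychological construct.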

As much as I have been a fan of cognitive psychology so far, I am now beginning to wonder which, and how many, of its concepts (like executive control, etc.) have been shown to obey these laws. I haven’t done a thorough survey or research, but I deeply suspect there haven’t been any published attempts in this direction. I would like to see some, but I think it may not be easy to pick a property that’s easy enough to deal with.

Also, I begin to wonder how many of these apply or scale to organizational psychology, or to committee-centered decision-making policies. Again, I suspect there have been very few attempts to scale/correlate cog. psych. concepts into organizational/behavioural psychology, never mind cross-checking them against the base assumptions of the relevant area of math.

from measure theory and cog. psych.

from Tumblr http://ift.tt/1TTCflS

via IFTTT
