Lectures and Conferences

    The Statistical, Theoretical, and Practical Aspects of the Effect Size

    Date: 2019-10-12

     

    Professor Frank Zenker, Lund University, Sweden

    The effect size’s statistical, theoretical, and practical aspects*

    Guanghua Forum: Lecture Series on Reasoning, Argumentation, and Communication

    Topic: The effect size’s statistical, theoretical, and practical aspects*

    Speaker: Professor Frank Zenker

    Host: Wu Xiaojing (吴晓静)

    Time: 2019-10-14, 10:00

    Venue: Library and Reference Room, C120, Tongbo Building (通博楼)

    Organizers: School of Humanities; Office of Scientific Research

     

    About the speaker:

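    Frank Zenker is currently a researcher in the Department of Philosophy and Cognitive Science at Lund University, Sweden; a researcher in the Department of Philosophy at the University of Konstanz, Germany; and a researcher at the Institute of Philosophy of the Slovak Academy of Sciences. He is also a visiting professor at the Institute of Logic and Cognition, Sun Yat-sen University. His main research areas include cognitive science, philosophy of science, and argumentation theory.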

    Abstract:

    An empirical effect’s overall significance, we submit, depends on its statistical, its theoretical, and its practical aspect; that is, on the effect’s size, on how well a theory predicts the effect, and on the utility ascribed to it. (An effect’s size here measures the causal influence or correlation between an experimental setting’s independent and dependent variables.) To evaluate the effects that journals in the social sciences, among other areas, typically report today, it is sound advice to treat each aspect as (analytically) separate. Under the statistical aspect, indeed, one cannot so much evaluate as merely describe an effect: the statistical effect size is a parameter estimate. Under the theoretical aspect, one must evaluate the fit between a theoretical prediction and empirical observations, using two error-pairs (α, β and γ, δ) to quantify the remaining uncertainty in, respectively, corroborating and applying a theory. Under the practical aspect, if an effect’s utility is measured as its expected value, then even a very small effect of little theoretical importance can nevertheless have high practical importance.
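
As a rough illustration of the statistical and practical aspects, the sketch below computes a standardized mean difference (Cohen's d, one common effect-size estimate) and an expected value. Both the choice of Cohen's d and the function names `cohens_d` and `expected_gain` are assumptions made for this example; the abstract does not name a specific measure, and all numbers are hypothetical.

```python
import math
import statistics


def cohens_d(treatment, control):
    """Standardized mean difference (Cohen's d) with a pooled standard deviation.

    This only describes the observed difference; it does not evaluate it.
    """
    n1, n2 = len(treatment), len(control)
    var1 = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


def expected_gain(probability_shift, value_per_case, cases):
    """Expected value of an effect: shift in success probability x value x cases."""
    return probability_shift * value_per_case * cases


# Hypothetical data and values, for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
gain = expected_gain(probability_shift=0.01, value_per_case=1_000, cases=100_000)
print(f"d = {d:.2f}, expected gain = {gain:,.0f}")
```

In this toy setting, a shift of one percentage point is small as an effect, yet its expected value becomes large once it is multiplied over many cases, which is the sense in which a small effect may still carry practical importance.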

    The current literature more or less displays research communities engaged in an “effect chase.” These communities, we argue, should move past the current praxis of collecting and publishing small non-random deviations, and on to the development of systematic research programs. Such programs should feature theories postulating effects that are much larger than the small effects which 3rd-order meta-analyses mostly report today. To defend these small effects as important, one must cite the practical aspect. (The goal of “making theoretical progress” alone does not suffice.) To corroborate small effects, moreover, takes samples much larger than what is typical today. Conversely, if we based a practical decision on what current (reported) empirical settings and theories tend to corroborate for individual behavior, then this decision would probably outperform a fair coin-toss only slightly. In the specific case of evaluating a theory by means of data, finally, one cannot simply treat (the size of) the ‘theoretically modelled average participant reaction’ as a stand-in for the participants’ observed individual reactions.
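
The two quantitative claims above (small effects need large samples; decisions based on small effects barely beat a coin toss) can be made concrete with a minimal sketch. It assumes a two-sided, two-sample comparison under the textbook normal approximation, and it uses the "probability of superiority" Φ(d/√2) as one possible reading of the coin-toss comparison. Neither is necessarily the calculation the authors have in mind, and the names `n_per_group` and `probability_of_superiority` are hypothetical. Only the α, β error pair is modelled here (with power = 1 − β); the γ, δ pair is not.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where power = 1 - beta.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


def probability_of_superiority(d):
    """P(random treated case > random control case), assuming two normal
    populations with equal variance: Phi(d / sqrt(2))."""
    return _STD_NORMAL.cdf(d / math.sqrt(2))


# Conventional "small", "medium", and "large" values of Cohen's d.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d), round(probability_of_superiority(d), 3))
```

Under these assumptions, an effect of d = 0.2 calls for roughly 393 participants per group, against about 25 for d = 0.8, and a prediction about two randomly drawn individuals would be right only about 56% of the time, not far above the 50% of a fair coin.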
